Optimized Active Learning Strategy for Audiovisual Speaker Recognition

Cited by: 4

Authors:

Karlos, Stamatis [1]

Kaleris, Konstantinos [2]

Fazakis, Nikos [2]

Kanas, Vasileios G. [2]

Kotsiantis, Sotiris [1]

Affiliations:

[1] Univ Patras, Dept Math, Rion 26504, Achaia, Greece

[2] Univ Patras, Dept Elect & Engn, Rion 26504, Achaia, Greece

Source:

SPEECH AND COMPUTER (SPECOM 2018) | 2018 / Vol. 11096

Keywords:

Active Learning; Optimized learner; Speaker Recognition; Audiovisual features; Support Vector Machines; Hyperopt package tool; SPEECH; EXTRACTION;

DOI:

10.1007/978-3-319-99579-3_30

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This work investigates the gain in recognition accuracy obtained by adding optimization stages that tune the parameters of an Active Learning (AL) classifier. Since plenty of data can be available in Speaker Recognition (SR) tasks, the AL concept, which places human annotators inside its learning loop to uncover hidden insights in unlabeled data, seems well suited and does not demand much expertise from the annotator. Six datasets containing utterances from 8 and 16 speakers under different recording setups are described by audiovisual features and evaluated with the time-efficient Uncertainty Sampling (UncS) query strategy. Both Support Vector Machines (SVMs) and Random Forest (RF) algorithms were tuned over a small subset of the initial training data and then applied iteratively to mine the most suitable instances from a corresponding pool of unlabeled instances. Useful conclusions are drawn about the values of the selected parameters, allowing future optimization attempts to be confined to more restricted regions, while remarkable improvement rates were obtained using an ideal annotator.
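The pool-based loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which uncertainty measure UncS uses, so least confidence is assumed here, and a toy one-dimensional centroid model stands in for the tuned SVM or RF learner. All function names (`least_confidence`, `query_most_uncertain`, `train_centroid_model`, `active_learning`) are hypothetical, and the `oracle` callable plays the role of the ideal annotator.

```python
import math
from typing import Callable, List, Sequence, Tuple

Model = Callable[[float], List[float]]


def least_confidence(proba: Sequence[float]) -> float:
    """Uncertainty score: 1 minus the probability of the most likely class."""
    return 1.0 - max(proba)


def query_most_uncertain(model: Model, pool: Sequence[float], batch_size: int) -> List[int]:
    """Return the pool indices of the batch_size most uncertain instances."""
    ranked = sorted(
        range(len(pool)),
        key=lambda i: least_confidence(model(pool[i])),
        reverse=True,
    )
    return ranked[:batch_size]


def train_centroid_model(xs: Sequence[float], ys: Sequence[int]) -> Model:
    """Toy stand-in for the tuned learner (assumption, not the paper's SVM/RF).

    Classes are 0 and 1; probabilities are a softmax over the negative
    distance to each class centroid. Both classes must be present.
    """
    centroids = []
    for label in (0, 1):
        members = [x for x, y in zip(xs, ys) if y == label]
        centroids.append(sum(members) / len(members))

    def predict_proba(x: float) -> List[float]:
        weights = [math.exp(-abs(x - c)) for c in centroids]
        total = sum(weights)
        return [w / total for w in weights]

    return predict_proba


def active_learning(
    xs: Sequence[float],
    ys: Sequence[int],
    pool: Sequence[float],
    oracle: Callable[[float], int],
    rounds: int,
    batch_size: int,
) -> Tuple[Model, List[float], List[int]]:
    """Retrain, query the most uncertain pool instances, label them, repeat."""
    xs, ys, pool = list(xs), list(ys), list(pool)
    for _ in range(rounds):
        if not pool:
            break
        model = train_centroid_model(xs, ys)
        picked = query_most_uncertain(model, pool, batch_size)
        # Pop from the highest index down so earlier indices stay valid.
        for i in sorted(picked, reverse=True):
            x = pool.pop(i)
            xs.append(x)
            ys.append(oracle(x))  # the annotator supplies the label
    return train_centroid_model(xs, ys), xs, ys


# Usage: two labeled seeds, four unlabeled points, an ideal annotator.
seed_x, seed_y = [0.0, 10.0], [0, 1]
unlabeled = [1.0, 4.9, 9.0, 5.2]
ideal_annotator = lambda x: int(x >= 5.0)
final_model, labeled_x, labeled_y = active_learning(
    seed_x, seed_y, unlabeled, ideal_annotator, rounds=2, batch_size=1
)
```

In this toy run the loop spends its two queries on the points nearest the class boundary (4.9, then 5.2) and leaves the easy points unlabeled, which is the behaviour uncertainty sampling is meant to produce. In the setting the abstract describes, the hyperparameters of the learner would first be tuned on a small labeled subset (the keywords mention the Hyperopt package) before entering such a loop.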


Pages: 281-290

Page count: 10
