Optimized Active Learning Strategy for Audiovisual Speaker Recognition

Cited by: 4
Authors
Karlos, Stamatis [1]
Kaleris, Konstantinos [2]
Fazakis, Nikos [2]
Kanas, Vasileios G. [2]
Kotsiantis, Sotiris [1]
Affiliations
[1] Univ Patras, Dept Math, Rion 26504, Achaia, Greece
[2] Univ Patras, Dept Elect & Engn, Rion 26504, Achaia, Greece
Keywords
Active Learning; Optimized learner; Speaker Recognition; Audiovisual features; Support Vector Machines; Hyperopt package tool; SPEECH; EXTRACTION
DOI
10.1007/978-3-319-99579-3_30
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work investigates the gains in recognition accuracy obtained by adding optimization stages that tune the parameters of an Active Learning (AL) classifier. Because plenty of data can be available in Speaker Recognition (SR) tasks, the AL concept, which places human annotators inside its learning kernel to explore hidden insights in unlabeled data, seems extremely suitable and does not demand much expertise from the annotator. Six datasets containing utterances from 8 and 16 speakers under different recording setups are described by audiovisual features and evaluated with the time-efficient Uncertainty Sampling (UncS) query strategy. Both Support Vector Machines (SVMs) and Random Forest (RF) algorithms were tuned over a small subset of the initial training data and then applied iteratively to mine the most suitable instances from a corresponding pool of unlabeled instances. Useful conclusions are drawn about the values of the selected parameters, allowing future optimization attempts to be confined to more restricted regions, while remarkable improvement rates were obtained using an ideal annotator.
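The pool-based loop summarized in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: a nearest-centroid model stands in for the tuned SVM or RF learner, least confidence is assumed as the uncertainty measure (the abstract does not say which one was used), the hyperparameter tuning stage is omitted, and every function name here is hypothetical.

```python
import math


def centroid_fit(X, y):
    """Stand-in learner (assumption): one mean vector per class."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def predict_proba(model, x):
    """Softmax over negative squared distances to each class centroid."""
    classes = sorted(model)
    scores = [-sum((a - b) ** 2 for a, b in zip(x, model[c])) for c in classes]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return classes, [e / total for e in exps]


def least_confidence(probs):
    """Uncertainty score: 1 minus the probability of the most likely class."""
    return 1.0 - max(probs)


def active_learning(X_lab, y_lab, X_pool, oracle, n_queries):
    """Repeatedly query the annotator for the most uncertain pool instance."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_pool)
    for _ in range(min(n_queries, len(pool))):
        model = centroid_fit(X_lab, y_lab)
        idx = max(
            range(len(pool)),
            key=lambda i: least_confidence(predict_proba(model, pool[i])[1]),
        )
        x = pool.pop(idx)
        X_lab.append(x)
        y_lab.append(oracle(x))  # an ideal annotator always returns the true label
    return centroid_fit(X_lab, y_lab), X_lab, y_lab


# Toy usage: two "speakers" separated along a single feature.
def ideal_annotator(x):
    return "a" if x[0] < 5 else "b"


model, X_final, y_final = active_learning(
    [[0.0], [10.0]], ["a", "b"], [[1.0], [5.2], [9.0]], ideal_annotator, n_queries=1
)
```

In this toy run the instance nearest the boundary between the two centroids is queried first, which is the behavior uncertainty sampling is meant to produce. A tuned SVM or RF with probability outputs could take the place of the stand-in learner without changing the loop.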
Pages: 281-290
Page count: 10