Optimized Active Learning Strategy for Audiovisual Speaker Recognition

Cited by: 4
Authors
Karlos, Stamatis [1]
Kaleris, Konstantinos [2]
Fazakis, Nikos [2]
Kanas, Vasileios G. [2]
Kotsiantis, Sotiris [1]
Affiliations
[1] Univ Patras, Dept Math, Rion 26504, Achaia, Greece
[2] Univ Patras, Dept Elect & Engn, Rion 26504, Achaia, Greece
Keywords
Active Learning; Optimized learner; Speaker Recognition; Audiovisual features; Support Vector Machines; Hyperopt package tool; SPEECH; EXTRACTION
DOI
10.1007/978-3-319-99579-3_30
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work investigates how much recognition accuracy improves when optimization stages are used to tune the parameters of an Active Learning (AL) classifier. Since large amounts of data are often available in Speaker Recognition (SR) tasks, the AL concept, which incorporates a human annotator into its learning process to extract information from unlabeled data, appears well suited, and it does not demand much expertise from the annotator. Six datasets containing utterances of 8 and 16 speakers under different recording setups are described by audiovisual features and evaluated with the time-efficient Uncertainty Sampling (UncS) query strategy. Both Support Vector Machines (SVMs) and Random Forest (RF) algorithms were tuned over a small subset of the initial training data and then applied iteratively to mine the most suitable instances from a corresponding pool of unlabeled instances. Useful conclusions are drawn about the values of the selected parameters, allowing future optimization attempts to be confined to more restricted regions, while remarkable improvement rates were obtained using an ideal annotator.
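The query step of Uncertainty Sampling mentioned in the abstract can be illustrated with a minimal sketch. It assumes the least-confidence variant of the strategy (the record does not say which uncertainty measure the authors used), and the function names and probability values below are hypothetical, chosen only for illustration. In a full AL loop, the selected instances would be labeled by the annotator and added to the training set before the classifier is retrained.

```python
from typing import List, Sequence


def least_confidence(probs: Sequence[float]) -> float:
    """Uncertainty score: 1 minus the probability of the most likely class."""
    return 1.0 - max(probs)


def select_queries(pool_probs: Sequence[Sequence[float]], batch_size: int) -> List[int]:
    """Return the indices of the `batch_size` most uncertain pool instances."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: least_confidence(pool_probs[i]),
        reverse=True,  # most uncertain first
    )
    return ranked[:batch_size]


# Hypothetical class-probability outputs of a classifier (e.g. an SVM with
# probability estimates) for four unlabeled utterances and three speakers.
pool_probs = [
    [0.90, 0.05, 0.05],  # confident prediction
    [0.40, 0.35, 0.25],
    [0.55, 0.40, 0.05],
    [0.34, 0.33, 0.33],  # nearly uniform, so highly uncertain
]

# Indices of the two instances that would be sent to the annotator.
queried = select_queries(pool_probs, batch_size=2)
print(queried)
```

Ranking by a single score per instance keeps the query step cheap, which is consistent with the abstract's description of the strategy as time-efficient.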
Pages: 281-290
Number of pages: 10
Related papers
50 in total
  • [1] Efficient Audiovisual Fusion for Active Speaker Detection
    Tesema, Fiseha B.
    Gu, Jason
    Song, Wei
    Wu, Hong
    Zhu, Shiqiang
    Lin, Zheyuan
    IEEE ACCESS, 2023, 11 : 45140 - 45153
  • [2] Phonetically optimized speaker modeling for robust speaker recognition
    Lee, Bong-Jin
    Choi, Jeung-Yoon
    Kang, Hong-Goo
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2009, 126 (03) : EL100 - EL106
  • [3] Limited Labels for Unlimited Data: Active Learning for Speaker Recognition
    Shum, Stephen H.
    Dehak, Najim
    Glass, James R.
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014 : 383 - 387
  • [4] Active set strategy of optimized extreme learning machine
    Ding, Xiao-Jian
    Chang, Bao-Fang
    CHINESE SCIENCE BULLETIN, 2014, 59 (31) : 4152 - 4160
  • [5] Active set strategy of optimized extreme learning machine
    Xiao-Jian Ding
    Bao-Fang Chang
    Chinese Science Bulletin, 2014, 59 (31) : 4152 - 4160
  • [6] The Optimized Dictionary based Robust Speaker Recognition
    You, Datao
    Qiao, Baojun
    Li, Jie
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2017, 86 (2-3) : 289 - 297
  • [7] A new genetically optimized GMM for speaker recognition
    Lin, Lin
    Wang, Shuxun
    WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS, 2006 : 704 - 704
  • [8] The Optimized Dictionary based Robust Speaker Recognition
    Datao You
    Baojun Qiao
    Jie Li
    Journal of Signal Processing Systems, 2017, 86 : 289 - 297
  • [9] Learning to Fool the Speaker Recognition
    Li, Jiguo
    Zhang, Xinfeng
    Xu, Jizheng
    Ma, Siwei
    Gao, Wen
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2021, 17 (03)
  • [10] LEARNING TO FOOL THE SPEAKER RECOGNITION
    Li, Jiguo
    Zhang, Xinfeng
    Xu, Jizheng
    Zhang, Li
    Wang, Yue
    Ma, Siwei
    Gao, Wen
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020 : 2937 - 2941