I-Vector DNN Scoring and Calibration for Noise Robust Speaker Verification

Cited by: 2
Authors
Tan, Zhili [1]
Mak, Man-Wai [1]
Affiliations
[1] Hong Kong Polytech Univ, Dept Elect & Informat Engn, Hong Kong, Peoples R China
Keywords
Deep learning; speaker verification; score calibration; multi-task learning; noise robustness; PLDA
DOI
10.21437/Interspeech.2017-656
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes applying multi-task learning to train deep neural networks (DNNs) for calibrating the PLDA scores of speaker verification systems under noisy environments. To help the DNNs learn the main task (calibration), several auxiliary tasks were introduced, including predicting SNR and duration from i-vectors and classifying whether an i-vector pair belongs to the same speaker. The possibility of replacing the PLDA model with a DNN during the scoring stage is also explored. Evaluations on noise-contaminated speech suggest that the auxiliary tasks are important for the DNNs to learn the main calibration task and that the uncalibrated PLDA scores are an essential input to the DNNs. Without this input, the DNNs can only predict the score shifts accurately, suggesting that the PLDA model is indispensable.
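The multi-task setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the single shared tanh layer, the layer sizes, the squared-error and cross-entropy losses, the `aux_weight` factor, and every name below (`init_params`, `forward`, `multitask_loss`) are assumptions made for illustration only. The sketch shows one shared hidden layer fed with an i-vector pair and the uncalibrated PLDA score, with one output head for the main calibration task and three heads for the auxiliary tasks (SNR, duration, same-speaker classification).

```python
import math
import random

def dense(x, W, b):
    """Affine layer on plain Python lists: returns W @ x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (hypothetical initialisation)."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def init_params(ivec_dim, hidden, seed=0):
    rng = random.Random(seed)
    n_in = 2 * ivec_dim + 1  # i-vector pair plus the uncalibrated PLDA score
    return {
        "shared": init_layer(rng, n_in, hidden),
        "calib": init_layer(rng, hidden, 1),  # main task: calibrated score
        "snr": init_layer(rng, hidden, 1),    # auxiliary: SNR regression
        "dur": init_layer(rng, hidden, 1),    # auxiliary: duration regression
        "same": init_layer(rng, hidden, 1),   # auxiliary: same speaker or not
    }

def forward(params, ivec_a, ivec_b, plda_score):
    """Shared hidden layer followed by one linear head per task."""
    x = list(ivec_a) + list(ivec_b) + [plda_score]
    h = [math.tanh(v) for v in dense(x, *params["shared"])]
    logit = dense(h, *params["same"])[0]
    return {
        "calibrated": dense(h, *params["calib"])[0],
        "snr": dense(h, *params["snr"])[0],
        "duration": dense(h, *params["dur"])[0],
        "same_speaker": 1.0 / (1.0 + math.exp(-logit)),
    }

def multitask_loss(out, target, aux_weight=0.5):
    """Main-task loss plus a weighted sum of the auxiliary losses."""
    main = (out["calibrated"] - target["calibrated"]) ** 2
    p = min(max(out["same_speaker"], 1e-12), 1.0 - 1e-12)
    y = target["same_speaker"]
    bce = -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    aux = ((out["snr"] - target["snr"]) ** 2
           + (out["duration"] - target["duration"]) ** 2
           + bce)
    return main + aux_weight * aux
```

Setting `aux_weight=0.0` in this sketch reduces the objective to the calibration loss alone, which is one simple way to compare training with and without the auxiliary tasks.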
Pages: 1562-1566
Number of pages: 5
Related papers
50 in total
  • [31] SPEAKER VERIFICATION USING SIMPLIFIED AND SUPERVISED I-VECTOR MODELING
    Li, Ming
    Tsiartas, Andreas
    Van Segbroeck, Maarten
    Narayanan, Shrikanth S.
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7199 - 7203
  • [32] On the Complementary Role of DNN Multi-Level Enhancement for Noisy Robust Speaker Recognition in an I-Vector Framework
    Zhang, Xingyu
    Zou, Xia
    Sun, Meng
    Wu, Penglong
    Wang, Yimin
    He, Jun
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2020, E103A (01) : 356 - 360
  • [33] Minimax i-vector extractor for short duration speaker verification
    Hautamaki, Ville
    Cheng, You-Chi
    Rajan, Padmanabhan
    Lee, Chin-Hui
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3675 - 3679
  • [34] Non-linear PLDA for i-Vector Speaker Verification
    Novoselov, Sergey
    Pekhovsky, Timur
    Kudashev, Oleg
    Mendelev, Valentin
    Prudnikov, Alexey
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 214 - 218
  • [35] Bayesian Distance Metric Learning on i-vector for Speaker Verification
    Fang, Xiao
    Dehak, Najim
    Glass, James
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2513 - 2517
  • [36] Noise Robust Speaker Recognition Based on Adaptive Frame Weighting in GMM for i-Vector Extraction
    Zhang, Xingyu
    Zou, Xia
    Sun, Meng
    Zheng, Thomas Fang
    Jia, Chong
    Wang, Yimin
    IEEE ACCESS, 2019, 7 : 27874 - 27882
  • [37] ADDITIVE NOISE COMPENSATION IN THE I-VECTOR SPACE FOR SPEAKER RECOGNITION
    Ben Kheder, Waad
    Matrouf, Driss
    Bonastre, Jean-Francois
    Ajili, Moez
    Bousquet, Pierre-Michel
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4190 - 4194
  • [38] SIMPLIFIED VTS-BASED I-VECTOR EXTRACTION IN NOISE-ROBUST SPEAKER RECOGNITION
    Lei, Yun
    McLaren, Mitchell
    Ferrer, Luciana
    Scheffer, Nicolas
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [39] Double Joint Bayesian Modeling of DNN Local I-Vector for Text Dependent Speaker Verification with Random Digit Strings
    Shi, Ziqiang
    Lin, Huibin
    Liu, Liu
    Liu, Rujie
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 67 - 71
  • [40] Effect of long-term ageing on i-vector speaker verification
    Kelly, Finnian
    Saeidi, Rahim
    Harte, Naomi
    van Leeuwen, David
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 86 - 90