I-Vector DNN Scoring and Calibration for Noise Robust Speaker Verification

被引:2
|
作者
Tan, Zhili [1 ]
Mak, Man-Wai [1 ]
机构
[1] Hong Kong Polytech Univ, Dept Elect & Informat Engn, Hong Kong, Peoples R China
关键词
Deep learning; speaker verification; score calibration; multi-task learning; noise robustness; PLDA;
D O I
10.21437/Interspeech.2017-656
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This paper proposes applying multi-task learning to train deep neural networks (DNNs) for calibrating the PLDA scores of speaker verification systems under noisy environments. To facilitate the DNNs to learn the main task (calibration). several auxiliary tasks were introduced, including the prediction of SNR and duration from i-vectors and classifying whether an i-vector pair belongs to the same speaker or not. The possibility of replacing the PLDA model by a DNN during the scoring stage is also explored. Evaluations on noise contaminated speech suggest that the auxiliary tasks are important for the DNNs to learn the main calibration task and that the uncalibrated PLDA scores are an essential input to the DNNs. Without this input, the DNNs can only predict the score shifts accurately. suggesting that the PLDA model is indispensable.
引用
收藏
页码:1562 / 1566
页数:5
相关论文
共 50 条
  • [1] SPEAKER DIARIZATION WITH PLDA I-VECTOR SCORING AND UNSUPERVISED CALIBRATION
    Sell, Gregory
    Garcia-Romero, Daniel
    2014 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY SLT 2014, 2014, : 413 - 417
  • [2] Noise Compensation in i-vector Space Using Linear Regression for Robust Speaker Verification
    Baby, Renjith
    Kumar, C. Santhosh
    George, Kuruvachan K.
    Panda, Ashish
    PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON MULTIMEDIA, SIGNAL PROCESSING AND COMMUNICATION TECHNOLOGIES (IMPACT), 2017, : 161 - 165
  • [3] A Segmental DNN/i-vector Approach for Digit-Prompted Speaker Verification
    Yan, Jie
    Lei, Xie
    Wang, Guangsen
    Fu, Zhong-Hua
    2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017), 2017, : 1 - 5
  • [4] Fast Scoring for Mixture of PLDA in I-Vector/PLDA Speaker Verification
    Mak, Man-Wai
    2015 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2015, : 587 - 593
  • [5] An I-Vector Backend for Speaker Verification
    Kenny, Patrick
    Stafylakis, Themos
    Alam, Jahangir
    Kockmann, Marcel
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2307 - 2311
  • [6] IMPROVING DNN SPEAKER INDEPENDENCE WITH I-VECTOR INPUTS
    Senior, Andrew
    Lopez-Moreno, Ignacio
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [7] DNN i-vector Speaker Verification with Short, Text-constrained Test Utterances
    Zhong, Jinghua
    Hu, Wenping
    Soong, Frank
    Meng, Helen
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1507 - 1511
  • [8] A NOISE ROBUST I-VECTOR EXTRACTOR USING VECTOR TAYLOR SERIES FOR SPEAKER RECOGNITION
    Lei, Yun
    Burget, Lukas
    Scheffer, Nicolas
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 6788 - 6791
  • [9] DNN-Based Score Calibration With Multitask Learning for Noise Robust Speaker Verification
    Tan, Zhili
    Mak, Man-Wai
    Mak, Brian Kan-Wing
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (04) : 700 - 712
  • [10] DNN FEATURE COMPENSATION FOR NOISE ROBUST SPEAKER VERIFICATION
    Du, Steven
    Xiao, Xiong
    Chng, Eng Siong
    2015 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING, 2015, : 871 - 875