DEEP NEURAL NETWORK DRIVEN MIXTURE OF PLDA FOR ROBUST I-VECTOR SPEAKER VERIFICATION

Citations: 0
Authors
Li, Na [1]
Mak, Man-Wai [1]
Chien, Jen-Tzung [2]
Affiliations
[1] Hong Kong Polytech Univ, Dept Elect & Informat Engn, Hong Kong, Hong Kong, Peoples R China
[2] Natl Chiao Tung Univ, Dept Elect & Comp Engn, Hsinchu, Taiwan
Keywords
Speaker verification; i-vector; mixture of PLDA; deep neural networks; SNR mismatch
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In speaker recognition, the mismatch between the enrollment and test utterances due to noise with different signal-to-noise ratios (SNRs) is a great challenge. Based on the observation that noise-level variability causes the i-vectors to form heterogeneous clusters, this paper proposes using an SNR-aware deep neural network (DNN) to guide the training of PLDA mixture models. Specifically, given an i-vector, the SNR posterior probabilities produced by the DNN are used as the posteriors of indicator variables of the mixture model. As a result, the proposed model provides a more reasonable soft division of the i-vector space compared to the conventional mixture of PLDA. During verification, given a test trial, the marginal likelihoods from individual PLDA models are linearly combined by the posterior probabilities of SNR levels computed by the DNN. Experimental results for SNR mismatch tasks based on NIST 2012 SRE suggest that the proposed model is more effective than PLDA and conventional mixture of PLDA for handling heterogeneous corpora.
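The verification step described in the abstract, where per-model marginal likelihoods are linearly combined using the DNN's SNR posteriors, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `combine_marginal_likelihoods`, the use of log-domain inputs, and the example numbers are all assumptions made here, and the sketch covers only the weighted combination, not PLDA training or the full verification score.

```python
import math


def combine_marginal_likelihoods(log_likelihoods, snr_posteriors):
    """Return log(sum_k posterior_k * exp(log_likelihood_k)).

    log_likelihoods: one log marginal likelihood per PLDA model (hypothetical inputs).
    snr_posteriors:  one SNR-level posterior per PLDA model, assumed to sum to 1.
    """
    if len(log_likelihoods) != len(snr_posteriors):
        raise ValueError("need exactly one posterior per PLDA model")
    if abs(sum(snr_posteriors) - 1.0) > 1e-6:
        raise ValueError("posteriors must sum to 1")

    # Move the weights into the log domain; components with zero weight
    # contribute nothing to the sum and are skipped to avoid log(0).
    terms = [
        ll + math.log(w)
        for ll, w in zip(log_likelihoods, snr_posteriors)
        if w > 0.0
    ]

    # Log-sum-exp: subtract the largest term before exponentiating so that
    # very negative log-likelihoods do not underflow to zero.
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# Illustrative values only (not taken from the paper): three PLDA models,
# with the DNN placing most posterior mass on the second SNR level.
example_log_likelihoods = [-120.4, -95.7, -101.2]
example_posteriors = [0.1, 0.7, 0.2]
combined = combine_marginal_likelihoods(example_log_likelihoods, example_posteriors)
```

Working in the log domain is a design choice of this sketch: likelihoods of high-dimensional i-vectors are typically far too small to represent directly, so the linear combination is evaluated with the log-sum-exp trick.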
Pages: 186-191
Page count: 6
Related papers
50 items in total
  • [41] Cosine Metric Learning for Speaker Verification in the i-Vector Space
    Bai, Zhong
    Zhang, Xiao-Lei
    Chen, Jingdong
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 1126 - 1130
  • [42] Geometric Discriminant Analysis for I-vector Based Speaker Verification
    Xu, Can
    Chen, Xianhong
    He, Liang
    Liu, Jia
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 1636 - 1640
  • [43] Bayesian Principal Component Analysis for I-Vector Speaker Verification
    Rong Y.-F.
    Chen C.
    Chen D.-Y.
    He Y.-J.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2021, 49 (11): : 2186 - 2194
  • [44] WEIGHTED LDA TECHNIQUES FOR I-VECTOR BASED SPEAKER VERIFICATION
    Kanagasundaram, A.
    Dean, D.
    Vogt, R.
    McLaren, M.
    Sridharan, S.
    Mason, M.
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4781 - 4784
  • [45] Simplified supervised i-vector modeling with application to robust and efficient language identification and speaker verification
    Li, Ming
    Narayanan, Shrikanth
    COMPUTER SPEECH AND LANGUAGE, 2014, 28 (04): : 940 - 958
  • [46] PERFORMANCE OF I-VECTOR SPEAKER VERIFICATION AND THE DETECTION OF SYNTHETIC SPEECH
    McClanahan, Richard D.
    Stewart, Bryan
    De Leon, Phillip L.
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [47] SPEAKER VERIFICATION USING SIMPLIFIED AND SUPERVISED I-VECTOR MODELING
    Li, Ming
    Tsiartas, Andreas
    Van Segbroeck, Maarten
    Narayanan, Shrikanth S.
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7199 - 7203
  • [48] Minimax i-vector extractor for short duration speaker verification
    Hautamaki, Ville
    Cheng, You-Chi
    Rajan, Padmanabhan
    Lee, Chin-Hui
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3675 - 3679
  • [49] Bayesian Distance Metric Learning on i-vector for Speaker Verification
    Fang, Xiao
    Dehak, Najim
    Glass, James
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2513 - 2517
  • [50] Robust i-vector extraction for neural network adaptation in noisy environment
    Yu, Chengzhu
    Ogawa, Atsunori
    Delcroix, Marc
    Yoshioka, Takuya
    Nakatani, Tomohiro
    Hansen, John H. L.
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2854 - 2857