DEEP NEURAL NETWORK DRIVEN MIXTURE OF PLDA FOR ROBUST I-VECTOR SPEAKER VERIFICATION

Cited by: 0
Authors
Li, Na [1]
Mak, Man-Wai [1]
Chien, Jen-Tzung [2]
Affiliations
[1] Hong Kong Polytech Univ, Dept Elect & Informat Engn, Hong Kong, Hong Kong, Peoples R China
[2] Natl Chiao Tung Univ, Dept Elect & Comp Engn, Hsinchu, Taiwan
Keywords
Speaker verification; i-vector; mixture of PLDA; deep neural networks; SNR mismatch
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In speaker recognition, the mismatch between the enrollment and test utterances due to noise with different signal-to-noise ratios (SNRs) is a great challenge. Based on the observation that noise-level variability causes the i-vectors to form heterogeneous clusters, this paper proposes using an SNR-aware deep neural network (DNN) to guide the training of PLDA mixture models. Specifically, given an i-vector, the SNR posterior probabilities produced by the DNN are used as the posteriors of indicator variables of the mixture model. As a result, the proposed model provides a more reasonable soft division of the i-vector space compared to the conventional mixture of PLDA. During verification, given a test trial, the marginal likelihoods from individual PLDA models are linearly combined by the posterior probabilities of SNR levels computed by the DNN. Experimental results for SNR mismatch tasks based on NIST 2012 SRE suggest that the proposed model is more effective than PLDA and conventional mixture of PLDA for handling heterogeneous corpora.
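The verification step described in the abstract (per-model marginal likelihoods linearly combined by DNN-derived SNR posteriors) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `mixture_llr`, the use of a single weight per mixture component for a trial, and all numbers in the usage example are assumptions made for illustration. The sketch takes the per-component log marginal likelihoods under the same-speaker and different-speaker hypotheses as given and combines them in the log domain for numerical stability.

```python
import numpy as np


def mixture_llr(weights, loglik_same, loglik_diff):
    """Log-likelihood-ratio score from a weighted mixture of PLDA models.

    weights     : shape (K,), SNR-level posteriors for the trial (assumed
                  to come from an SNR-aware DNN; must sum to 1).
    loglik_same : shape (K,), log marginal likelihood of the trial under
                  the same-speaker hypothesis for each PLDA component.
    loglik_diff : shape (K,), the same under the different-speaker hypothesis.
    """
    weights = np.asarray(weights, dtype=float)
    loglik_same = np.asarray(loglik_same, dtype=float)
    loglik_diff = np.asarray(loglik_diff, dtype=float)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must sum to 1")

    # log(0) = -inf is acceptable here: a zero-weight component drops out.
    with np.errstate(divide="ignore"):
        log_w = np.log(weights)

    # log sum_k w_k * p_k(trial | hypothesis), computed stably in the log domain.
    log_num = np.logaddexp.reduce(log_w + loglik_same)
    log_den = np.logaddexp.reduce(log_w + loglik_diff)
    return float(log_num - log_den)


if __name__ == "__main__":
    # Made-up values for three hypothetical SNR groups.
    posteriors = [0.7, 0.2, 0.1]
    score = mixture_llr(posteriors, [-10.0, -12.0, -15.0], [-13.0, -12.5, -14.0])
    print(f"LLR score: {score:.3f}")
```

When the posterior puts all its mass on one SNR group, the score reduces to that single PLDA model's log-likelihood ratio, which is the hard-assignment special case of the soft combination.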
Pages: 186-191
Page count: 6