DNN FEATURE COMPENSATION FOR NOISE ROBUST SPEAKER VERIFICATION

Cited by: 0
Authors
Du, Steven [1 ,2 ]
Xiao, Xiong [2 ]
Chng, Eng Siong [1 ,2 ]
Institutions
[1] Nanyang Technol Univ, Sch Comp Engn, Singapore, Singapore
[2] Nanyang Technol Univ, Temasek Labs, Singapore, Singapore
Keywords
noise robustness; speaker verification; DNN; feature compensation; neural networks
DOI
Not available
CLC classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline codes
0808; 0809;
Abstract
The speaker verification (SV) task has been an active area of research for the last thirty years. One recent research direction is improving the robustness of SV systems in challenging environments. This paper examines the robustness of a current state-of-the-art SV system against background noise corruption. Specifically, we consider the scenario where the SV system is trained on noise-free speech and tested on speech corrupted by background noise. To improve the robustness of the system, a deep neural network (DNN) based feature compensation method is proposed to enhance the cepstral features before evaluation. The DNN is trained on parallel data of clean and noise-corrupted speech aligned at the frame level, by minimizing the mean square error (MSE) between the DNN's prediction and the target clean features. The trained network can then predict the underlying clean features from noisy features. Results on the benchmark SRE 2010 female core task show that DNN-based feature compensation reduces the equal error rate (EER) in most cases, even when the test noise is unseen during DNN training. The relative EER reduction is usually in the range of 3% to 26%.
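The abstract describes learning a frame-level mapping from noisy to clean cepstral features by minimizing MSE on parallel data. A minimal NumPy sketch of that idea is given below; the one-hidden-layer network, synthetic stand-in "cepstral" features, dimensions, and hyperparameters are all illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy parallel corpus: stand-ins for frame-aligned clean and noisy MFCCs.
n_frames, feat_dim, hidden = 2000, 13, 64
clean = rng.standard_normal((n_frames, feat_dim))
noisy = clean + 0.5 * rng.standard_normal(clean.shape)  # additive noise

# One-hidden-layer network: noisy frame -> predicted clean frame.
W1 = rng.standard_normal((feat_dim, hidden)) * 0.1
b1 = np.zeros(hidden)
W2 = rng.standard_normal((hidden, feat_dim)) * 0.1
b2 = np.zeros(feat_dim)

def forward(x):
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

_, pred0 = forward(noisy)
loss_before = mse(pred0, clean)

lr = 0.05
for _ in range(1000):  # plain batch gradient descent on the MSE objective
    h, pred = forward(noisy)
    g = 2.0 * (pred - clean) / pred.size   # dLoss/dpred
    gW2 = h.T @ g
    gb2 = g.sum(axis=0)
    ga = (g @ W2.T) * (1.0 - h ** 2)       # backprop through tanh
    gW1 = noisy.T @ ga
    gb1 = ga.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

_, pred1 = forward(noisy)
loss_after = mse(pred1, clean)  # compensated features are closer to clean
```

At test time the trained network would be applied to noisy features before scoring, so the SV back-end sees an estimate of the underlying clean features.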
Pages: 871 - 875
Page count: 5
Related papers
50 items in total
  • [31] Blind Stochastic Feature Transformation for Channel Robust Speaker Verification
    Yiu, K. K.
    Mak, M. W.
    Cheung, M. C.
    Kung, S. Y.
    Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology, 2006, 42 (02): 117 - 126
  • [33] Adversarial Network Bottleneck Features for Noise Robust Speaker Verification
    Yu, Hong
    Tan, Zheng-Hua
    Ma, Zhanyu
    Guo, Jun
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1492 - 1496
  • [34] BOOSTED BINARY FEATURES FOR NOISE-ROBUST SPEAKER VERIFICATION
    Roy, Anindya
    Magimai-Doss, Mathew
    Marcel, Sebastien
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4442 - 4445
  • [35] Noise robust automatic speaker verification systems: review and analysis
    Joshi, Sanil
    Dua, Mohit
    TELECOMMUNICATION SYSTEMS, 2024, 87 (03) : 845 - 886
  • [36] NONNEGATIVE MATRIX FACTORIZATION BASED NOISE ROBUST SPEAKER VERIFICATION
    Liu, S. H.
    Zou, Y. X.
    Ning, H. K.
    2015 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING, 2015, : 35 - 39
  • [37] Noise robust speaker verification using Mel-Frequency Discrete Wavelet Coefficients and parallel model compensation
    Tufekci, Z
    Gurbuz, S
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 657 - 660
  • [38] RAPID JOINT SPEAKER AND NOISE COMPENSATION FOR ROBUST SPEECH RECOGNITION
    Chin, K. K.
    Xu, Haitian
    Gales, Mark J. F.
    Breslin, Catherine
    Knill, Kate
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5500 - 5503
  • [39] Shouted Speech Compensation for Speaker Verification Robust to Vocal Effort Conditions
    Prieto, Santi
    Ortega, Alfonso
    Lopez-Espejo, Ivan
    Lleida, Eduardo
    INTERSPEECH 2020, 2020, : 1511 - 1515
  • [40] Compensation of Intrinsic Variability with Factor Analysis Modeling for Robust Speaker Verification
    Chen, Sheng
    Xu, Mingxing
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1574 - 1577