DNN FEATURE COMPENSATION FOR NOISE ROBUST SPEAKER VERIFICATION

Cited by: 0
Authors
Du, Steven [1, 2]
Xiao, Xiong [2]
Chng, Eng Siong [1, 2]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Engn, Singapore, Singapore
[2] Nanyang Technol Univ, Temasek Labs, Singapore, Singapore
Keywords
noise robustness; speaker verification; DNN; feature compensation; neural networks
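DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809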
Abstract
Speaker verification (SV) has been an active area of research for the last thirty years. One recent research topic is improving the robustness of SV systems in challenging environments. This paper examines the robustness of a current state-of-the-art SV system against background noise corruption. Specifically, we consider the scenario where the SV system is trained on noise-free speech and tested on speech corrupted by background noise. To improve the robustness of the system, deep neural network (DNN) based feature compensation is proposed to enhance the cepstral features before evaluation. The DNN is trained on parallel data of clean and noise-corrupted speech that are aligned at the frame level. Training minimizes the mean square error (MSE) between the DNN's prediction and the target clean features. The trained network can then predict the underlying clean features when given noisy features. Results on the benchmark SRE 2010 female core task show that DNN-based feature compensation reduces the equal error rate (EER) in most cases, even when the test noise is unseen during DNN training. The relative EER reduction is usually in the range of 3% to 26%.
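The training idea in the abstract (a network that maps noisy frames to clean frames by minimizing MSE on frame-aligned parallel data) can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the 13-dimensional features, the single 64-unit tanh hidden layer, the synthetic Gaussian "clean" frames, the additive Gaussian noise applied directly in the feature domain, and the plain full-batch gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): 13-dim frames, 64 hidden units.
NUM_FRAMES, FEAT_DIM, HIDDEN_DIM = 2000, 13, 64

# Synthetic frame-aligned parallel data standing in for real clean/noisy features.
clean = rng.normal(size=(NUM_FRAMES, FEAT_DIM))
noisy = clean + 0.5 * rng.normal(size=(NUM_FRAMES, FEAT_DIM))

# One-hidden-layer regression network with small random initial weights.
params = {
    "W1": rng.normal(0.0, 0.1, (FEAT_DIM, HIDDEN_DIM)),
    "b1": np.zeros(HIDDEN_DIM),
    "W2": rng.normal(0.0, 0.1, (HIDDEN_DIM, FEAT_DIM)),
    "b2": np.zeros(FEAT_DIM),
}


def forward(p, x):
    """Return (predicted clean frames, hidden activations) for noisy frames x."""
    hidden = np.tanh(x @ p["W1"] + p["b1"])
    return hidden @ p["W2"] + p["b2"], hidden


def mse(pred, target):
    """Mean square error averaged over all frames and feature dimensions."""
    return float(np.mean((pred - target) ** 2))


def train_step(p, x, y, lr):
    """One full-batch gradient-descent step on the MSE loss (manual backprop)."""
    pred, hidden = forward(p, x)
    grad_pred = 2.0 * (pred - y) / pred.size          # dLoss/dpred
    grad_hidden = grad_pred @ p["W2"].T               # back through output layer
    grad_pre = grad_hidden * (1.0 - hidden ** 2)      # back through tanh
    p["W2"] -= lr * (hidden.T @ grad_pred)
    p["b2"] -= lr * grad_pred.sum(axis=0)
    p["W1"] -= lr * (x.T @ grad_pre)
    p["b1"] -= lr * grad_pre.sum(axis=0)


initial_loss = mse(forward(params, noisy)[0], clean)
for _ in range(500):
    train_step(params, noisy, clean, lr=0.2)
final_loss = mse(forward(params, noisy)[0], clean)

# "Feature compensation": replace noisy frames with the network's prediction.
enhanced = forward(params, noisy)[0]
print(f"MSE before training: {initial_loss:.3f}, after: {final_loss:.3f}")
```

In a real system the enhanced frames, rather than the noisy ones, would be passed on to the SV back end; the network size, input context, and optimizer would need to be chosen for actual speech features.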
Pages: 871-875
Number of pages: 5
Related papers
50 in total
  • [1] Psychoacoustic Model Compensation with Robust Feature Set for Speaker Verification in Additive Noise
    Panda, Ashish
    2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 629 - 632
  • [2] Feature recovery for noise-robust speaker verification
    Huang, Houjun
    Xu, Yunfei
    Zhou, Ruohua
    Yan, Yonghong
    ELECTRONICS LETTERS, 2015, 51 (18) : 1459 - 1461
  • [3] Psychoacoustic Model Compensation for Robust Speaker Verification in Environmental Noise
    Panda, Ashish
    Srikanthan, Thambipillai
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (03) : 945 - 953
  • [4] I-Vector DNN Scoring and Calibration for Noise Robust Speaker Verification
    Tan, Zhili
    Mak, Man-Wai
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1562 - 1566
  • [5] Investigation of DNN based Feature Enhancement Jointly Trained with X-Vectors for Noise-Robust Speaker Verification
    Yang, Joon-Young
    Park, Kwan-Ho
    Chang, Joon-Hyuk
    Kim, Youngsam
    Cho, Sangrae
    2020 INTERNATIONAL CONFERENCE ON ELECTRONICS, INFORMATION, AND COMMUNICATION (ICEIC), 2020,
  • [6] DNN-Based Score Calibration With Multitask Learning for Noise Robust Speaker Verification
    Tan, Zhili
    Mak, Man-Wai
    Mak, Brian Kan-Wing
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (04) : 700 - 712
  • [7] DNN-based Amplitude and Phase Feature Enhancement for Noise Robust Speaker Identification
    Oo, Zeyan
    Kawakami, Yuta
    Wang, Longbiao
    Nakagawa, Seiichi
    Xiao, Xiong
    Iwahashi, Masahiro
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2204 - 2208
  • [8] Pitch synchronous based feature extraction for noise-robust speaker verification
    Gong Wei-Guo
    Yang Li-Ping
    Chen Di
    CISP 2008: FIRST INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOL 5, PROCEEDINGS, 2008, : 295 - 298
  • [9] DNN-Driven Mixture of PLDA for Robust Speaker Verification
    Li, Na
    Mak, Man-Wai
    Chien, Jen-Tzung
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (06) : 1371 - 1383
  • [10] Mismatch modeling and compensation for robust speaker verification
    Lei, Yun
    Hansen, John H. L.
    SPEECH COMMUNICATION, 2011, 53 (02) : 257 - 268