Investigation of DNN based Feature Enhancement Jointly Trained with X-Vectors for Noise-Robust Speaker Verification

Cited by: 0
Authors
Yang, Joon-Young [1]
Park, Kwan-Ho [1]
Chang, Joon-Hyuk [1]
Kim, Youngsam [2]
Cho, Sangrae [2]
Affiliations
[1] Hanyang Univ, Dept Elect & Comp Engn, Seoul, South Korea
[2] Elect & Telecommun Res Inst, Informat Secur Res Div, Daejeon, South Korea
Keywords
speaker verification; deep speaker embedding; feature enhancement; joint training
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract
In this paper, we investigate deep neural network (DNN) based feature enhancement as a denoising front end for the x-vector speaker verification framework in noisy environments. First, the feature enhancement DNN (FE-DNN) learns the mapping from noisy to clean speech in the frame-level acoustic feature domain, and then the x-vector network (XvectorNet) is trained on top of the enhanced features. Finally, the separately trained FE-DNN and XvectorNet are concatenated in series and jointly trained under the supervision of the cross-entropy loss. In addition, we adopt the logistic margin softmax layer for training the XvectorNet in order to obtain more discriminative speaker embeddings.
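The training stages described in the abstract can be illustrated with a toy forward pass. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the single affine layer standing in for the FE-DNN, the mean-squared-error regression loss for the first stage, the linear classifier placed directly on pooled statistics, and all function names are hypothetical. A plain softmax cross-entropy replaces the logistic margin softmax layer, whose formulation the abstract does not give.

```python
import math

def affine(x, W, b):
    """Return W @ x + b for one feature vector x (plain Python lists)."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def enhance(noisy_frames, W, b):
    """Hypothetical stand-in for the FE-DNN: one affine map per frame."""
    return [affine(frame, W, b) for frame in noisy_frames]

def mse(pred_frames, clean_frames):
    """Assumed stage-1 regression loss between enhanced and clean frames."""
    total = sum((p - c) ** 2
                for pf, cf in zip(pred_frames, clean_frames)
                for p, c in zip(pf, cf))
    return total / (len(pred_frames) * len(pred_frames[0]))

def stats_pool(frames):
    """Concatenate the per-dimension mean and standard deviation over time."""
    n = len(frames)
    dims = range(len(frames[0]))
    mean = [sum(f[d] for f in frames) / n for d in dims]
    std = [math.sqrt(sum((f[d] - mean[d]) ** 2 for f in frames) / n)
           for d in dims]
    return mean + std

def cross_entropy(logits, label):
    """Plain softmax cross-entropy for one utterance (no margin term)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def joint_loss(noisy_frames, label, fe_params, clf_params):
    """Speaker loss computed through the enhancement front end, so that
    in joint training its gradient would reach both sets of parameters."""
    enhanced = enhance(noisy_frames, *fe_params)
    embedding = stats_pool(enhanced)
    logits = affine(embedding, *clf_params)
    return cross_entropy(logits, label)

# Toy data: 3 frames of 2-dimensional features, 2 speakers.
noisy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
clean = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
fe_params = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])  # identity front end
clf_params = ([[0.0] * 4, [0.0] * 4], [0.0, 0.0])   # all-zero classifier

stage1 = mse(enhance(noisy, *fe_params), clean)          # enhancement loss
stage3 = joint_loss(noisy, 0, fe_params, clf_params)     # joint speaker loss
```

In this sketch the first stage would minimize `mse` with respect to `fe_params` only, the second stage would minimize the speaker loss with respect to `clf_params` on enhanced features, and the joint stage would update both through `joint_loss`. A real x-vector network also has frame-level layers before pooling and embedding layers after it, which are omitted here.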
Pages: 5
Related papers
17 in total
  • [1] X-VECTORS: ROBUST DNN EMBEDDINGS FOR SPEAKER RECOGNITION
    Snyder, David
    Garcia-Romero, Daniel
    Sell, Gregory
    Povey, Daniel
    Khudanpur, Sanjeev
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5329 - 5333
  • [2] Feature recovery for noise-robust speaker verification
    Huang, Houjun
    Xu, Yunfei
    Zhou, Ruohua
    Yan, Yonghong
    ELECTRONICS LETTERS, 2015, 51 (18) : 1459 - 1461
  • [3] DNN FEATURE COMPENSATION FOR NOISE ROBUST SPEAKER VERIFICATION
    Du, Steven
    Xiao, Xiong
    Chng, Eng Siong
    2015 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING, 2015, : 871 - 875
  • [4] Pitch synchronous based feature extraction for noise-robust speaker verification
    Gong Wei-Guo
    Yang Li-Ping
    Chen Di
    CISP 2008: FIRST INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOL 5, PROCEEDINGS, 2008, : 295 - 298
  • [5] DNN-based Amplitude and Phase Feature Enhancement for Noise Robust Speaker Identification
    Oo, Zeyan
    Kawakami, Yuta
    Wang, Longbiao
    Nakagawa, Seiichi
    Xiao, Xiong
    Iwahashi, Masahiro
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2204 - 2208
  • [6] Joint optimization of neural acoustic beamforming and dereverberation with x-vectors for robust speaker verification
    Yang, Joon-Young
    Chang, Joon-Hyuk
    INTERSPEECH 2019, 2019, : 4075 - 4079
  • [7] Noise-robust feature based on sparse representation for speaker recognition
    Qi, Hongzhuo
    Metallurgical and Mining Industry, 2015, 7 (04): : 64 - 69
  • [8] Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification
    Michelsanti, Daniel
    Tan, Zheng-Hua
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 2008 - 2012
  • [9] Weighted X-Vectors for Robust Text-Independent Speaker Verification with Multiple Enrollment Utterances
    Mohammadi, Mohsen
    Mohammadi, Hamid Reza Sadegh
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2022, 41 (05) : 2825 - 2844