An Adaptation Method in Noise Mismatch Conditions for DNN-based Speech Enhancement

Cited by: 1
Authors
Xu Si-Ying [1]
Niu Tong [1]
Qu Dan [1]
Long Xing-Yan [1]
Affiliation
[1] Natl Digital Switching Syst Engn & Technol R&D Ct, Zhengzhou, Henan, Peoples R China
Keywords
Noise-aware training; identity-vector; L2 regularization; speech enhancement; DNN; condition mismatch; INTELLIGIBILITY; RECOGNITION; SUPPRESSION; SELECTION; MODEL
DOI
10.3837/tiis.2018.10.017
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Deep-learning-based speech enhancement has shown considerable success, but its performance still degrades under mismatched conditions. This paper proposes an adaptation method that improves performance under noise mismatch conditions. First, we use noise-aware training, supplying identity vectors (i-vectors) as parallel input features so that the deep neural network (DNN) acoustic model adapts to the target noise. Second, given a small amount of adaptation data, a noise-dependent DNN is derived from a noise-independent DNN using L2 regularization, forcing the estimated masks to stay close to those of the unadapted condition. Experiments across different noise and SNR conditions show that the proposed method improves STOI by 0.1%-9.6% and gives consistent improvements in PESQ and segSNR over the baseline systems.
Pages: 4930-4951
Number of pages: 22
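
The two steps summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the weighting factors `lam_w` and `lam_m`, and the exact form of the loss (mean squared error on masks, a squared L2 penalty on the weight change, and a penalty on the mask change) are assumptions made for illustration, since the record does not give the formulas.

```python
import numpy as np


def append_ivector(frames, ivector):
    """Noise-aware input: attach one utterance-level i-vector to every frame.

    frames:  (num_frames, feat_dim) acoustic features
    ivector: (ivec_dim,) i-vector assumed to describe the target noise
    """
    tiled = np.tile(ivector, (frames.shape[0], 1))
    return np.concatenate([frames, tiled], axis=1)


def adaptation_loss(mask_adapted, mask_target, mask_unadapted,
                    weights, weights_init, lam_w=1e-3, lam_m=0.1):
    """Hypothetical adaptation objective (an assumption, not the paper's).

    fit term:       adapted masks should match the adaptation targets
    weight penalty: squared L2 distance from the noise-independent weights
    mask penalty:   adapted masks should stay close to the unadapted masks
    """
    fit = np.mean((mask_adapted - mask_target) ** 2)
    weight_penalty = sum(np.sum((w - w0) ** 2)
                         for w, w0 in zip(weights, weights_init))
    mask_penalty = np.mean((mask_adapted - mask_unadapted) ** 2)
    return fit + lam_w * weight_penalty + lam_m * mask_penalty


if __name__ == "__main__":
    frames = np.zeros((5, 3))            # 5 frames, 3 features each
    ivector = np.array([0.2, -0.1])      # toy 2-dimensional i-vector
    print(append_ivector(frames, ivector).shape)
```

Both penalties pull the adapted network toward the noise-independent one, which is one common way to avoid overfitting when only a small amount of adaptation data is available.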