An Adaptation Method in Noise Mismatch Conditions for DNN-based Speech Enhancement

Cited by: 1
Authors
Xu Si-Ying [1]
Niu Tong [1]
Qu Dan [1]
Long Xing-Yan [1]
Affiliations
[1] Natl Digital Switching Syst Engn & Technol R&D Ct, Zhengzhou, Henan, Peoples R China
Keywords
Noise-aware training; identity-vector; L-2 regularization; speech enhancement; DNN; condition mismatch; INTELLIGIBILITY; RECOGNITION; SUPPRESSION; SELECTION; MODEL
DOI
10.3837/tiis.2018.10.017
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Deep learning based speech enhancement has shown considerable success, but its performance still degrades under mismatch conditions. This paper proposes an adaptation method to improve performance under noise mismatch conditions. First, we devise a noise-aware training scheme that supplies identity vectors (i-vectors) as parallel input features to adapt deep neural network (DNN) acoustic models to the target noise. Second, given a small amount of adaptation data, a noise-dependent DNN is obtained from a noise-independent DNN by using L-2 regularization and forcing the estimated masks to stay close to those of the unadapted condition. Finally, experiments on different noise and SNR conditions show that the proposed method achieves STOI gains of 0.1%-9.6% and provides consistent improvements in PESQ and segSNR over the baseline systems.
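The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the network, the feature dimensions, or the exact loss, so the function names (`append_ivector`, `adaptation_loss`), the mean-squared-error form, and the weight `lam` are all assumptions made for illustration. The first function appends one utterance-level i-vector to every frame as a parallel input feature; the second adds an L-2 penalty that keeps the adapted masks close to the masks of the unadapted, noise-independent model.

```python
import numpy as np


def append_ivector(frames: np.ndarray, ivector: np.ndarray) -> np.ndarray:
    """Append the same i-vector to every frame: (T, D) and (K,) -> (T, D + K)."""
    # Repeat the utterance-level i-vector once per frame (hypothetical layout).
    tiled = np.broadcast_to(ivector, (frames.shape[0], ivector.shape[0]))
    return np.concatenate([frames, tiled], axis=1)


def adaptation_loss(pred_mask, target_mask, unadapted_mask, lam=0.1):
    """Mask MSE plus an L-2 term tying the adapted masks to the unadapted ones.

    `lam` is an assumed regularization weight, not a value from the paper.
    """
    fit = np.mean((pred_mask - target_mask) ** 2)
    # Penalize drift away from the noise-independent model's masks.
    drift = np.mean((pred_mask - unadapted_mask) ** 2)
    return fit + lam * drift


# Usage: 3 frames of 4-dimensional features, a 2-dimensional i-vector.
features = append_ivector(np.zeros((3, 4)), np.array([1.0, 2.0]))
loss = adaptation_loss(
    np.array([0.5, 0.5]), np.array([1.0, 0.0]), np.array([0.0, 0.0]), lam=0.1
)
```

With a small adaptation set, a larger `lam` keeps the adapted model nearer to the noise-independent one, while a smaller `lam` lets it fit the target noise more closely.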
Pages: 4930-4951
Page count: 22
Related papers
50 in total
  • [31] Power Exponent Based Weighting Criterion for DNN-Based Mask Approximation in Speech Enhancement
    Cui, Zihao
    Bao, Changchun
    IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 618 - 622
  • [32] ILMSAF based speech enhancement with DNN and noise classification
    Li, Ruwei
    Liu, Yanan
    Shi, Yongqiang
    Dong, Liang
    Cui, Weili
    SPEECH COMMUNICATION, 2016, 85 : 53 - 70
  • [33] INVERTIBLE DNN-BASED NONLINEAR TIME-FREQUENCY TRANSFORM FOR SPEECH ENHANCEMENT
    Takeuchi, Daiki
    Yatabe, Kohei
    Koizumi, Yuma
    Oikawa, Yasuhiro
    Harada, Noboru
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6644 - 6648
  • [34] DNN-BASED DISTRIBUTED MULTICHANNEL MASK ESTIMATION FOR SPEECH ENHANCEMENT IN MICROPHONE ARRAYS
    Furnon, Nicolas
    Serizel, Romain
    Illina, Irina
    Essid, Slim
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 4672 - 4676
  • [35] DNN-BASED SPEECH RECOGNITION FOR GLOBALPHONE LANGUAGES
    Tachbelie, Martha Yifiru
    Abulimiti, Ayimunishagu
    Abate, Solomon Teferra
    Schultz, Tanja
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8269 - 8273
  • [36] DNN-based Amplitude and Phase Feature Enhancement for Noise Robust Speaker Identification
    Oo, Zeyan
    Kawakami, Yuta
    Wang, Longbiao
    Nakagawa, Seiichi
    Xiao, Xiong
    Iwahashi, Masahiro
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2204 - 2208
  • [37] Experimental Evaluation of Speech Enhancement for In-Car Environment Using Blind Source Separation and DNN-based Noise Suppression
    Takeuchi, Yutsuki
    Nakashima, Taishi
    Ono, Nobutaka
    Takazawa, Takashi
    Shimanoe, Shuhei
    Tsuchiya, Yoshinori
    2024 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2024,
  • [38] SNR-Based Features and Diverse Training Data for Robust DNN-Based Speech Enhancement
    Rehr, Robert
    Gerkmann, Timo
    IEEE/ACM Transactions on Audio Speech and Language Processing, 2021, 29 : 1937 - 1949
  • [39] SNR-Based Features and Diverse Training Data for Robust DNN-Based Speech Enhancement
    Rehr, Robert
    Gerkmann, Timo
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 1937 - 1949
  • [40] Towards More Efficient DNN-Based Speech Enhancement Using Quantized Correlation Mask
    Abdullah, Salinna
    Zamani, Majid
    Demosthenous, Andreas
    IEEE ACCESS, 2021, 9 : 24350 - 24362