An Adaptation Method in Noise Mismatch Conditions for DNN-based Speech Enhancement

Authors
Xu Si-Ying [1]
Niu Tong [1]
Qu Dan [1]
Long Xing-Yan [1]
Affiliations
[1] Natl Digital Switching Syst Engn & Technol R&D Ct, Zhengzhou, Henan, Peoples R China
Keywords
Noise-aware training; identity-vector; L2 regularization; speech enhancement; DNN; condition mismatch; INTELLIGIBILITY; RECOGNITION; SUPPRESSION; SELECTION; MODEL
DOI
10.3837/tiis.2018.10.017
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Deep-learning-based speech enhancement has shown considerable success, but its performance still degrades under mismatched conditions. This paper proposes an adaptation method that improves performance when the noise at test time does not match the training noise. First, we propose noise-aware training, in which identity vectors (i-vectors) are supplied as parallel input features to adapt the deep neural network (DNN) acoustic model to the target noise. Second, given a small amount of adaptation data, a noise-dependent DNN is derived from a noise-independent DNN using L2 regularization, which constrains the estimated masks to stay close to those of the unadapted model. Finally, experiments across different noise types and SNR conditions show that the proposed method improves STOI by 0.1%-9.6% and yields consistent improvements in PESQ and segSNR over the baseline systems.
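The two steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `reg_weight` parameter, and the choice to apply the L2 penalty directly to the masks (rather than to the network weights) are assumptions made here for illustration, since the abstract does not give the exact formulation.

```python
from typing import List, Sequence

Vector = List[float]


def append_ivector(frames: Sequence[Sequence[float]],
                   ivector: Sequence[float]) -> List[Vector]:
    """Concatenate one utterance-level i-vector onto every frame's features.

    This is one common way to supply an i-vector as a parallel input
    (assumed here; the abstract does not specify the wiring).
    """
    return [list(frame) + list(ivector) for frame in frames]


def mean_squared_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean of squared element-wise differences of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def adaptation_loss(adapted_mask: Sequence[float],
                    target_mask: Sequence[float],
                    unadapted_mask: Sequence[float],
                    reg_weight: float) -> float:
    """Data-fit term plus an L2 penalty tying the adapted mask to the
    mask produced by the unadapted (noise-independent) model.

    reg_weight is a hypothetical hyperparameter trading off the two terms.
    """
    fit = mean_squared_diff(adapted_mask, target_mask)
    penalty = mean_squared_diff(adapted_mask, unadapted_mask)
    return fit + reg_weight * penalty


# Usage: two 2-dimensional frames augmented with a 3-dimensional i-vector,
# so each augmented frame has 2 + 3 = 5 dimensions.
frames = [[0.1, 0.2], [0.3, 0.4]]
ivector = [1.0, -1.0, 0.5]
augmented = append_ivector(frames, ivector)
```

A larger `reg_weight` keeps the adapted model closer to the noise-independent one, which is the usual motivation for this kind of regularization when only a small amount of adaptation data is available.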
Pages: 4930-4951
Number of pages: 22