An Adaptation Method in Noise Mismatch Conditions for DNN-based Speech Enhancement

Cited by: 1

Authors:

Xu Si-Ying [1]

Niu Tong [1]

Qu Dan [1]

Long Xing-Yan [1]

Affiliations:

[1] Natl Digital Switching Syst Engn & Technol R&D Ct, Zhengzhou, Henan, Peoples R China

Source:

KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS | 2018, Vol. 12, No. 10

Keywords:

Noise-aware Training; identity-vector; L-2 regularization; speech enhancement; DNN; condition mismatch; INTELLIGIBILITY; RECOGNITION; SUPPRESSION; SELECTION; MODEL

DOI:

10.3837/tiis.2018.10.017

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Deep learning based speech enhancement has shown considerable success, but it still suffers performance degradation under mismatch conditions. This paper proposes an adaptation method to improve performance under noise mismatch conditions. First, we propose noise-aware training that supplies identity vectors (i-vectors) as parallel input features to adapt deep neural network (DNN) acoustic models to the target noise. Second, given a small amount of adaptation data, a noise-dependent DNN is obtained from a noise-independent DNN using L-2 regularization, which forces the estimated masks to stay close to the unadapted condition. Finally, experiments on different noise and SNR conditions show that the proposed method achieves significant STOI gains of 0.1%-9.6% and provides consistent improvements in PESQ and segSNR over the baseline systems.
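
The two adaptation ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the flat weight lists, the mean-squared-error mask loss, and the choice to apply the L-2 penalty to the difference between adapted and noise-independent weights are all assumptions made for illustration, since the abstract does not give these details.

```python
def append_ivector(frames, ivector):
    """Append the same i-vector to every frame's feature vector.

    frames: list of per-frame feature lists (hypothetical layout).
    ivector: list of floats describing the target noise (assumed given).
    """
    return [list(frame) + list(ivector) for frame in frames]


def l2_adaptation_loss(pred_mask, target_mask, adapted_w, reference_w, lam):
    """Mask MSE plus an L-2 penalty that keeps the adapted weights
    close to the noise-independent (reference) weights.

    lam is an assumed regularization weight; lam = 0 disables the penalty.
    """
    n = len(pred_mask)
    mse = sum((p - t) ** 2 for p, t in zip(pred_mask, target_mask)) / n
    penalty = sum((a - r) ** 2 for a, r in zip(adapted_w, reference_w))
    return mse + lam * penalty


# Usage: two frames with two features each, plus a one-dimensional i-vector.
augmented = append_ivector([[1.0, 2.0], [3.0, 4.0]], [0.5])
loss = l2_adaptation_loss([0.5, 0.5], [1.0, 0.0], [2.0, 2.0], [1.0, 2.0], 0.5)
```

In this sketch a larger `lam` keeps the adapted model nearer to the noise-independent one, which is one way to limit overfitting when only a small amount of adaptation data is available.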


Pages: 4930-4951

Page count: 22

50 items in total

[31] Power Exponent Based Weighting Criterion for DNN-Based Mask Approximation in Speech Enhancement
Cui, Zihao
Bao, Changchun
IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 618 - 622
[32] ILMSAF based speech enhancement with DNN and noise classification
Li, Ruwei
Liu, Yanan
Shi, Yongqiang
Dong, Liang
Cui, Weili
SPEECH COMMUNICATION, 2016, 85 : 53 - 70
[33] INVERTIBLE DNN-BASED NONLINEAR TIME-FREQUENCY TRANSFORM FOR SPEECH ENHANCEMENT
Takeuchi, Daiki
Yatabe, Kohei
Koizumi, Yuma
Oikawa, Yasuhiro
Harada, Noboru
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6644 - 6648
[34] DNN-BASED DISTRIBUTED MULTICHANNEL MASK ESTIMATION FOR SPEECH ENHANCEMENT IN MICROPHONE ARRAYS
Furnon, Nicolas
Serizel, Romain
Illina, Irina
Essid, Slim
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 4672 - 4676
[35] DNN-BASED SPEECH RECOGNITION FOR GLOBALPHONE LANGUAGES
Tachbelie, Martha Yifiru
Abulimiti, Ayimunishagu
Abate, Solomon Teferra
Schultz, Tanja
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8269 - 8273
[36] DNN-based Amplitude and Phase Feature Enhancement for Noise Robust Speaker Identification
Oo, Zeyan
Kawakami, Yuta
Wang, Longbiao
Nakagawa, Seiichi
Xiao, Xiong
Iwahashi, Masahiro
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2204 - 2208
[37] Experimental Evaluation of Speech Enhancement for In-Car Environment Using Blind Source Separation and DNN-based Noise Suppression
Takeuchi, Yutsuki
Nakashima, Taishi
Ono, Nobutaka
Takazawa, Takashi
Shimanoe, Shuhei
Tsuchiya, Yoshinori
2024 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2024,
[38] SNR-Based Features and Diverse Training Data for Robust DNN-Based Speech Enhancement
Rehr, Robert
Gerkmann, Timo
IEEE/ACM Transactions on Audio Speech and Language Processing, 2021, 29 : 1937 - 1949
[39] SNR-Based Features and Diverse Training Data for Robust DNN-Based Speech Enhancement
Rehr, Robert
Gerkmann, Timo
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 1937 - 1949
[40] Towards More Efficient DNN-Based Speech Enhancement Using Quantized Correlation Mask
Abdullah, Salinna
Zamani, Majid
Demosthenous, Andreas
IEEE ACCESS, 2021, 9 : 24350 - 24362
