An Adaptation Method in Noise Mismatch Conditions for DNN-based Speech Enhancement

Cited by: 1

Authors:

Xu Si-Ying [1]

Niu Tong [1]

Qu Dan [1]

Long Xing-Yan [1]

Affiliation:

[1] National Digital Switching System Engineering & Technological R&D Center, Zhengzhou, Henan, People's Republic of China

Source:

KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS | 2018 / Vol. 12 / No. 10

Keywords:

Noise-aware Training; identity-vector; L-2 regularization; speech enhancement; DNN; condition mismatch; INTELLIGIBILITY; RECOGNITION; SUPPRESSION; SELECTION; MODEL

DOI:

10.3837/tiis.2018.10.017

Chinese Library Classification:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Deep-learning-based speech enhancement has shown considerable success, but it still suffers performance degradation under mismatch conditions. This paper proposes an adaptation method to improve performance under noise mismatch conditions. First, noise-aware training is introduced: identity vectors (i-vectors) are supplied as parallel input features to adapt deep neural network (DNN) acoustic models to the target noise. Second, given a small amount of adaptation data, a noise-dependent DNN is obtained from a noise-independent DNN using L-2 regularization, while forcing the estimated masks to stay close to those of the unadapted condition. Finally, experiments on different noise types and SNR conditions show that the proposed method achieves STOI gains of 0.1%-9.6% and provides consistent improvements in PESQ and segSNR over the baseline systems.
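
The two adaptation ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`append_ivector`, `adaptation_loss`), the mean-squared-error objective, the weight `lam`, and the choice to place the L-2 penalty on the difference between adapted and unadapted mask estimates are all assumptions made for illustration.

```python
from typing import List

Frame = List[float]


def append_ivector(frames: List[Frame], ivector: Frame) -> List[Frame]:
    """Concatenate one utterance-level i-vector onto every frame's features.

    Assumption: "parallel input features" means frame-wise concatenation.
    """
    return [frame + ivector for frame in frames]


def mean_squared_error(a: Frame, b: Frame) -> float:
    """Mean of squared element-wise differences between two vectors."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def adaptation_loss(
    adapted_mask: Frame,
    target_mask: Frame,
    unadapted_mask: Frame,
    lam: float,
) -> float:
    """Data-fit term plus an L-2 penalty toward the unadapted mask.

    Assumption: `lam` (hypothetical) trades off fitting the adaptation data
    against staying close to the noise-independent model's estimate.
    """
    fit = mean_squared_error(adapted_mask, target_mask)
    penalty = mean_squared_error(adapted_mask, unadapted_mask)
    return fit + lam * penalty


# Usage: two 2-dimensional frames, each extended with a 2-dimensional i-vector.
features = append_ivector([[0.1, 0.2], [0.3, 0.4]], [1.0, -1.0])
loss = adaptation_loss([0.5, 0.5], [1.0, 0.0], [0.5, 0.5], lam=0.1)
```

With a larger `lam`, the adapted model is held closer to the noise-independent one, which is the usual motivation for regularized adaptation when only a small amount of adaptation data is available.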

Pages: 4930-4951

Number of pages: 22
