Unseen Noise Estimation Using Separable Deep Auto Encoder for Speech Enhancement

Cited by: 62
Authors
Sun, Meng [1]
Zhang, Xiongwei [1]
Van hamme, Hugo [2]
Zheng, Thomas Fang [3]
Affiliations
[1] PLA Univ Sci & Technol, Lab Intelligent Informat Proc, Nanjing 210007, Jiangsu, Peoples R China
[2] Katholieke Univ Leuven, Elect Engn Dept ESAT, Speech Proc Res Grp, B-3000 Louvain, Belgium
[3] Tsinghua Univ, Res Inst Informat Technol, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep auto encoder; source separation; speech enhancement; unseen noise compensation; HMM
DOI
10.1109/TASLP.2015.2498101
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Unseen noise estimation is a key yet challenging step to make a speech enhancement algorithm work in adverse environments. At worst, the only prior knowledge we have about the encountered noise is that it is different from the involved speech. Therefore, by subtracting the components which cannot be adequately represented by a well-defined speech model, the noises can be estimated and removed. Given the good performance of deep learning in signal representation, a deep auto encoder (DAE) is employed in this work for accurately modeling the clean speech spectrum. In the subsequent stage of speech enhancement, an extra DAE is introduced to represent the residual part obtained by subtracting the estimated clean speech spectrum (by using the pre-trained DAE) from the noisy speech spectrum. By adjusting the estimated clean speech spectrum and the unknown parameters of the noise DAE, one can reach a stationary point to minimize the total reconstruction error of the noisy speech spectrum. The enhanced speech signal is thus obtained by transforming the estimated clean speech spectrum back into the time domain. The proposed technique is called separable deep auto encoder (SDAE). Given the under-determined nature of the above optimization problem, the clean speech reconstruction is confined to the convex hull spanned by a pre-trained speech dictionary. New learning algorithms are investigated to respect the non-negativity of the parameters in the SDAE. Experimental results on TIMIT with 20 noise types at various noise levels demonstrate the superiority of the proposed method over the conventional baselines.
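The convex-hull constraint mentioned in the abstract can be illustrated with a small toy sketch. This is not the SDAE from the paper: both auto encoders are omitted, and the dictionary, spectrum values, step size, and all function names are assumptions made up for illustration. It only shows one common way to keep a speech estimate inside the convex hull of a dictionary (projected gradient descent over simplex weights) and to treat the non-negative residual as a crude noise estimate.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t  # threshold for the largest index that stays positive
    return [max(x - theta, 0.0) for x in v]


def combine(atoms, weights):
    """Weighted sum of dictionary atoms, one value per frequency bin."""
    n_bins = len(atoms[0])
    return [sum(w * atom[f] for w, atom in zip(weights, atoms))
            for f in range(n_bins)]


def estimate_speech_and_noise(noisy, atoms, steps=500, lr=0.02):
    """Fit the noisy spectrum with a convex combination of atoms.

    Hypothetical helper: minimizes the squared error between the noisy
    spectrum and the combination, keeping the weights on the simplex.
    """
    k = len(atoms)
    weights = [1.0 / k] * k
    for _ in range(steps):
        speech = combine(atoms, weights)
        residual = [s - y for s, y in zip(speech, noisy)]
        grad = [2.0 * sum(r * a for r, a in zip(residual, atom))
                for atom in atoms]
        weights = project_to_simplex(
            [w - lr * g for w, g in zip(weights, grad)])
    speech = combine(atoms, weights)
    # Whatever the dictionary cannot explain is taken as noise (clipped at 0).
    noise = [max(y - s, 0.0) for y, s in zip(noisy, speech)]
    return weights, speech, noise


# Made-up magnitude spectra with four frequency bins.
atoms = [[4.0, 1.0, 0.0, 0.0], [0.0, 1.0, 3.0, 0.0]]
noisy = [2.8, 1.0, 0.9, 1.0]  # 0.7 * atom 0 + 0.3 * atom 1, plus noise in bin 3
weights, speech, noise = estimate_speech_and_noise(noisy, atoms)
print(weights, speech, noise)
```

In this toy case the noise sits in a bin that no atom covers, so the fit recovers the mixing weights and leaves the noise in the residual. Real noise overlaps speech in frequency, which is why the paper models the residual with a second DAE rather than a simple subtraction.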
Pages: 93-104
Number of pages: 12