A systematic study of DNN based speech enhancement in reverberant and reverberant-noisy environments

Cited by: 0
Authors
Wang, Heming [1]
Pandey, Ashutosh [1]
Wang, Deliang [2]
Affiliations
[1] Ohio State Univ, 281 Lane Ave, Columbus, OH 43210 USA
[2] Ctr Cognit & Brain Sci, 1835 Neil Ave, Columbus, OH 43210 USA
Source
Keywords
Speech enhancement; Speech dereverberation; Self-attention; ARN; DC-CRN; NEURAL-NETWORK; DEREVERBERATION; IDENTIFICATION; RECOGNITION;
DOI
10.1016/j.csl.2024.101677
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
Deep learning has led to dramatic performance improvements for the task of speech enhancement, where deep neural networks (DNNs) are trained to recover clean speech from noisy and reverberant mixtures. Most of the existing DNN-based algorithms operate in the frequency domain, as time-domain approaches are believed to be less effective for speech dereverberation. In this study, we employ two DNNs: ARN (attentive recurrent network) and DC-CRN (densely-connected convolutional recurrent network), and systematically investigate the effects of different components on enhancement performance, such as window sizes, loss functions, and feature representations. We conduct evaluation experiments in two main conditions: reverberant-only and reverberant-noisy. Our findings suggest that incorporating larger window sizes is helpful for dereverberation, and adding transform operations (either convolutional or linear) to encode and decode waveform features improves the sparsity of the learned representations, and boosts the performance of time-domain models. Experimental results demonstrate that ARN and DC-CRN with proposed techniques achieve superior performance compared with other strong enhancement baselines.
Pages: 12
Related papers
50 in total
  • [1] DNN-BASED ENHANCEMENT OF NOISY AND REVERBERANT SPEECH
    Zhao, Yan
    Wang, DeLiang
    Merks, Ivo
    Zhang, Tao
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 6525 - 6529
  • [2] Enhancement of Reverberant Speech in Noisy Acoustical Environments
    Joorabchi, Marjan
    Ghorshi, Seyed
    Sarafnia, Ali
    2014 SIXTH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS AND SIGNAL PROCESSING (WCSP), 2014,
  • [3] SPATIAL DIFFUSENESS FEATURES FOR DNN-BASED SPEECH RECOGNITION IN NOISY AND REVERBERANT ENVIRONMENTS
    Schwarz, Andreas
    Huemmer, Christian
    Maas, Roland
    Kellermann, Walter
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4380 - 4384
  • [4] SESNet: A Speech Enhancement and Separation Network in Noisy Reverberant Environments
    Wang, Liusong
    Gao, Yuan
    Cao, Kaimin
    Hu, Ying
    MAN-MACHINE SPEECH COMMUNICATION, NCMMSC 2024, 2025, 2312 : 44 - 54
  • [5] Speech intelligibility improvement in noisy reverberant environments based on speech enhancement and inverse filtering
    Dong, Huan-Yu
    Lee, Chang-Myung
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2018,
  • [6] Speech intelligibility improvement in noisy reverberant environments based on speech enhancement and inverse filtering
    Huan-Yu Dong
    Chang-Myung Lee
    EURASIP Journal on Audio, Speech, and Music Processing, 2018
  • [7] Speech Emotion Recognition in Noisy and Reverberant Environments
    Heracleous, Panikos
    Yasuda, Keiji
    Sugaya, Fumiaki
    Yoneyama, Akio
    Hashimoto, Masayuki
    2017 SEVENTH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2017, : 262 - 266
  • [8] Speech enhancement of instantaneous amplitude and phase for applications in noisy reverberant environments
    Liu, Yang
    Nower, Naushin
    Morita, Shota
    Unoki, Masashi
    SPEECH COMMUNICATION, 2016, 84 : 1 - 14
  • [9] A PROGRESSIVE ENHANCEMENT METHOD FOR NOISY AND REVERBERANT SPEECH
    Shu, Xiaofeng
    Zhou, Yi
    Cao, Yin
    2018 IEEE 23RD INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP), 2018,
  • [10] Speech Intelligibility Enhancement in Noisy Reverberant Conditions
    Li, Junfeng
    Xia, Risheng
    Fang, Qiang
    Li, Aijun
    Yan, Yonghong
    2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2016,