Time-Reversal Enhancement Network With Cross-Domain Information for Noise-Robust Speech Recognition

Cited by: 1
Authors
Chao, Fu-An [1]
Hung, Jeih-Weih [3]
Sheu, Tommy [4]
Chen, Berlin [2]
Affiliations
[1] Natl Taiwan Normal Univ, Taipei 11677, Taiwan
[2] Natl Taiwan Normal Univ, Comp Sci & Informat Engn Dept, Taipei 11677, Taiwan
[3] Natl Chi Nan Univ, Dept Elect Engn, Puli 54516, Taiwan
[4] Delta Elect Inc, Delta Management Syst DMS Dept, Taipei 11491, Taiwan
Keywords
Feature extraction; Convolutional neural networks; Spectrogram; Noise measurement; Speech enhancement; Estimation; Time-domain analysis
DOI
10.1109/MMUL.2021.3139302
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Owing to the enormous progress in deep learning, speech enhancement (SE) techniques have shown promising efficacy and play a pivotal role as a front end to automatic speech recognition (ASR) systems, where they mitigate the effects of noise. In this article, we put forward a novel cross-domain time-reversal enhancement network (CD-TENET). CD-TENET leverages the time-reversed version of a speech signal, together with two effective features that capture the phase information of the signal in the time domain and the frequency domain, respectively, to improve SE performance for noise-robust ASR. Extensive experiments demonstrate that CD-TENET not only recovers the original speech effectively but also improves SE and ASR performance simultaneously. More surprisingly, the proposed CD-TENET method offers a marked relative word error rate reduction on test utterances contaminated with unseen noises when compared with a strong baseline trained under the multicondition setting.
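The abstract does not describe how CD-TENET uses the time-reversed signal, so the following is only a minimal sketch of time reversal itself, not the authors' network. It assumes NumPy and uses random noise as a hypothetical stand-in for a waveform; the helper name `time_reverse` is illustrative. It shows that reversing a real signal leaves the magnitude spectrum unchanged and alters only the phase, which is one reason phase-aware features in both domains are relevant here.

```python
import numpy as np


def time_reverse(signal):
    """Return a time-reversed copy of a 1-D waveform (hypothetical helper)."""
    return np.asarray(signal)[::-1].copy()


# Hypothetical stand-in for one second of 16 kHz audio (not real speech).
rng = np.random.default_rng(0)
noisy = rng.standard_normal(16000)
reversed_noisy = time_reverse(noisy)

# Frequency-domain view of both versions of the signal.
spec = np.fft.rfft(noisy)
spec_rev = np.fft.rfft(reversed_noisy)

# Time reversal of a real signal keeps the magnitude spectrum and changes the phase.
same_magnitude = np.allclose(np.abs(spec), np.abs(spec_rev))
same_spectrum = np.allclose(spec, spec_rev)

# Reversing twice recovers the original waveform exactly.
restored = np.array_equal(time_reverse(reversed_noisy), noisy)
```

In this sketch the reversed waveform carries the same spectral energy as the original but a different phase, so a model that sees both versions receives two views of the same utterance.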
Pages: 114-124
Number of pages: 11