An FFT-Based Companding Front End for Noise-Robust Automatic Speech Recognition

Cited by: 4
Authors
Raj, Bhiksha [1]
Turicchia, Lorenzo [2]
Schmidt-Nielsen, Bent [1]
Sarpeshkar, Rahul [2]
Affiliations
[1] MERL, Cambridge, MA 02139 USA
[2] MIT, Cambridge, MA 02139 USA
Keywords
Error Rate; Acoustics; Recognition Task; Recognition Performance; Auditory System
DOI
10.1155/2007/65420
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We describe an FFT-based companding algorithm for preprocessing speech before recognition. The algorithm mimics tone-to-tone suppression and masking in the auditory system to improve automatic speech recognition performance in noise. Because it is built on the FFT, it is computationally efficient and well suited to digital implementation. In an automotive digit recognition task on the CU-Move database, recorded in real environmental noise, the algorithm reduces the word error rate by 12.5% relative at -5 dB signal-to-noise ratio (SNR) and by 6.2% relative across all SNRs (-5 dB to +15 dB). On the Aurora-2 database, in which noise from several environments is added artificially, the algorithm improves the relative word error rate in almost all conditions. Copyright (C) 2007 Bhiksha Raj et al.
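The tone-to-tone suppression mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic per-bin companding chain (broad filter, compression, narrow filter, expansion) applied to a magnitude spectrum such as an FFT would produce. The function names, the Gaussian filter shapes, the widths `broad` and `narrow`, and the exponent `n1` are all assumptions chosen for illustration.

```python
import math


def gaussian_weights(center, width, n_bins):
    """Gaussian weighting over bins, equal to 1.0 at `center` (assumed shape)."""
    return [math.exp(-0.5 * ((k - center) / width) ** 2) for k in range(n_bins)]


def compand(mag, n1=0.3, broad=4.0, narrow=0.5):
    """Illustrative companding of a magnitude spectrum (list of floats).

    For each bin: a broad filter sets a compressive gain, a narrow filter
    then isolates the bin, and the result is expanded by the inverse exponent.
    All parameter values are hypothetical, not taken from the paper.
    """
    n_bins = len(mag)
    out = []
    for i in range(n_bins):
        F = gaussian_weights(i, broad, n_bins)   # broad pre-filter
        G = gaussian_weights(i, narrow, n_bins)  # narrow post-filter
        # Envelope seen through the broad filter (includes nearby strong tones).
        e1 = math.sqrt(sum((f * m) ** 2 for f, m in zip(F, mag)))
        if e1 == 0.0:
            out.append(0.0)
            continue
        # Compression: output level grows as e1 ** n1, with n1 < 1.
        gain = e1 ** (n1 - 1.0)
        compressed = [f * m * gain for f, m in zip(F, mag)]
        # Envelope seen through the narrow filter (mostly the bin itself).
        e2 = math.sqrt(sum((g * c) ** 2 for g, c in zip(G, compressed)))
        # Expansion with the inverse exponent restores an isolated tone.
        out.append(e2 ** (1.0 / n1))
    return out


# Usage: a weak tone alone versus the same tone next to a strong neighbour.
weak_only = [0.0] * 32
weak_only[13] = 0.1
both = list(weak_only)
both[10] = 1.0
print(compand(weak_only)[13], compand(both)[13], compand(both)[10])
```

In this sketch an isolated tone passes through at roughly its original level, because compression and expansion cancel. A weak tone within the broad filter's reach of a strong one is compressed by a gain set by the strong tone but expanded according to its own reduced level, so it comes out attenuated while the strong tone is largely preserved.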
Pages: 13