An FFT-Based Companding Front End for Noise-Robust Automatic Speech Recognition

Cited by: 4
Authors
Raj, Bhiksha [1]
Turicchia, Lorenzo [2]
Schmidt-Nielsen, Bent [1]
Sarpeshkar, Rahul [2]
Affiliations
[1] MERL, Cambridge, MA 02139 USA
[2] MIT, Cambridge, MA 02139 USA
Keywords
Error Rate; Acoustics; Recognition Task; Recognition Performance; Auditory System
DOI
10.1155/2007/65420
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We describe an FFT-based companding algorithm for preprocessing speech before recognition. The algorithm mimics tone-to-tone suppression and masking in the auditory system to improve automatic speech recognition performance in noise. Because it is built on the FFT, it is also computationally efficient and well suited to digital implementation. In an automotive digits recognition task with the CU-Move database, recorded in real environmental noise, the algorithm reduces the word error rate by a relative 12.5% at -5 dB signal-to-noise ratio (SNR) and by a relative 6.2% across all SNRs (-5 dB SNR to +15 dB SNR). On the Aurora-2 database, recorded with artificially added noise in several environments, the algorithm improves the relative word error rate in almost all situations. Copyright (C) 2007 Bhiksha Raj et al.
Pages: 13