An FFT-Based Companding Front End for Noise-Robust Automatic Speech Recognition

Cited by: 4
Authors
Raj, Bhiksha [1]
Turicchia, Lorenzo [2]
Schmidt-Nielsen, Bent [1]
Sarpeshkar, Rahul [2]
Affiliations
[1] MERL, Cambridge, MA 02139 USA
[2] MIT, Cambridge, MA 02139 USA
Keywords
Error Rate; Acoustics; Recognition Task; Recognition Performance; Auditory System
DOI
10.1155/2007/65420
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We describe an FFT-based companding algorithm for preprocessing speech before recognition. The algorithm mimics tone-to-tone suppression and masking in the auditory system to improve automatic speech recognition performance in noise. Because it is built on the FFT, it is also computationally efficient and well suited to digital implementation. In an automotive digit-recognition task on the CU-Move database, recorded in real environmental noise, the algorithm reduces the word error rate by a relative 12.5% at -5 dB signal-to-noise ratio (SNR) and by a relative 6.2% across all SNRs (-5 dB to +15 dB). On the Aurora-2 database, which contains artificially added noise from several environments, the algorithm improves the relative word error rate in almost all conditions. Copyright (C) 2007 Bhiksha Raj et al.
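The abstract reports its gains as relative changes in word error rate rather than absolute percentage points. The sketch below shows how such a relative figure is usually computed. The function name and the baseline and processed error rates are hypothetical values chosen only to illustrate the arithmetic; they are not results from the paper.

```python
def relative_wer_reduction(baseline_wer: float, processed_wer: float) -> float:
    """Return the relative word-error-rate reduction, in percent.

    Both arguments are word error rates in percent. The result is the
    absolute drop in WER expressed as a fraction of the baseline WER.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - processed_wer) / baseline_wer * 100.0


# Hypothetical numbers for illustration only (not taken from the paper):
# a baseline WER of 40.0% that drops to 35.0% after preprocessing is a
# 5-point absolute reduction, which is a 12.5% relative reduction.
example = relative_wer_reduction(40.0, 35.0)
print(f"relative WER reduction: {example:.1f}%")
```

Read this way, a relative improvement of 12.5% at a low SNR can correspond to a fairly small absolute change when the baseline error rate is already high, so the two kinds of figure should not be compared directly.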
Pages: 13