Combining speech enhancement and auditory feature extraction for robust speech recognition

Cited by: 40
Authors
Kleinschmidt, M [1 ]
Tchorz, J [1 ]
Kollmeier, B [1 ]
Affiliations
[1] Carl von Ossietzky Univ Oldenburg, AG Med Phys, D-26111 Oldenburg, Germany
Keywords
robust speech recognition; perceptive modeling; auditory front end; speech enhancement;
DOI
10.1016/S0167-6393(00)00047-9
Chinese Library Classification
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
A major deficiency in state-of-the-art automatic speech recognition (ASR) systems is the lack of robustness in additive and convolutional noise. The model of auditory perception (PEMO), developed by Dau et al. (T. Dau, D. Puschel, A. Kohlrausch, J. Acoust. Soc. Am. 99 (6) (1996) 3615-3622) for psychoacoustical purposes, partly overcomes these difficulties when used as a front end for automatic speech recognition. To further improve the performance of this auditory-based recognition system in background noise, different speech enhancement methods were examined, which have been evaluated in earlier studies as components of digital hearing aids. Monaural noise reduction, as proposed by Ephraim and Malah (Y. Ephraim, D. Malah, IEEE Trans. Acoust. Speech Signal Process. ASSP-32 (6) (1984) 1109-1121), was compared to a binaural filter and dereverberation algorithm after Wittkop et al. (T. Wittkop, S. Albani, V. Hohmann, J. Peissig, W. Woods, B. Kollmeier, Acustica United with Acta Acustica 83 (4) (1997) 684-699). Both noise reduction algorithms yield improvements in recognition performance equivalent to up to 10 dB SNR in non-reverberant conditions for all types of noise, while the performance in clean speech is not significantly affected. Even in real-world reverberant conditions the speech enhancement schemes lead to improvements in recognition performance comparable to an SNR gain of up to 5 dB. This effect exceeds expectations, as earlier studies found no increase in speech intelligibility for hearing-impaired human subjects. (C) 2001 Elsevier Science B.V. All rights reserved.
Pages: 75-91
Number of pages: 17
Related papers
50 in total
  • [41] Speech recognition as feature extraction for speaker recognition
    Stolcke, A.
    Shriberg, E.
    Ferrer, L.
    Kajarekar, S.
    Sonmez, K.
    Tur, G.
    2007 IEEE WORKSHOP ON SIGNAL PROCESSING APPLICATIONS FOR PUBLIC SECURITY AND FORENSICS, 2007, : 39 - +
  • [42] Robust speech recognition method based on discriminative environment feature extraction
    Han, JQ
    Gao, W
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2001, 16 (05) : 458 - 464
  • [43] Wavelet-based denoising for robust feature extraction for speech recognition
    Farooq, O
    Datta, S
    ELECTRONICS LETTERS, 2003, 39 (01) : 163 - 165
  • [44] Robust endpoint detection for speech recognition based on discriminative feature extraction
    Yamamoto, Koichi
    Jabloun, Firas
    Reinhard, Klaus
    Kawamura, Akinori
    2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 805 - 808
  • [45] Robust Feature Extraction for Speech Recognition Based on Perceptually Motivated MUSIC
    Han Zhi-yan
    Wang Jian
    PROCEEDINGS 2010 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND INFORMATION TECHNOLOGY, (ICCSIT 2010), VOL 1, 2010, : 98 - 102
  • [46] Robust Speech Recognition Method Based on Discriminative Environment Feature Extraction
    韩纪庆
    高文
    Journal of Computer Science and Technology, 2001, (05) : 458 - 464
  • [47] Filterbank Analysis of MFCC Feature Extraction in Robust Children Speech Recognition
    Naing, Hay Mar Soe
    Miyanaga, Yoshikazu
    Hidayat, Risanuri
    Winduratna, Bondhan
    2019 INTERNATIONAL SYMPOSIUM ON MULTIMEDIA AND COMMUNICATION TECHNOLOGY (ISMAC), 2019,
  • [48] Synchrony-Based Feature Extraction for Robust Automatic Speech Recognition
    de-La-Calle-Silos, Fernando
    Stern, Richard M.
    IEEE SIGNAL PROCESSING LETTERS, 2017, 24 (08) : 1158 - 1162
  • [49] Robust speech recognition method based on discriminative environment feature extraction
    Jiqing Han
    Wen Gao
    Journal of Computer Science and Technology, 2001, 16 : 458 - 464
  • [50] Robust speech recognition and feature extraction using HMM2
    Weber, K
    Ikbal, S
    Bengio, S
    Bourlard, H
    COMPUTER SPEECH AND LANGUAGE, 2003, 17 (2-3) : 195 - 211