Enhancing the magnitude spectrum of speech features for robust speech recognition

Cited by: 1
Authors
Hung, Jeih-weih [1]
Fan, Hao-teng [1]
Tu, Wen-hsiang [1]
Affiliations
[1] Natl Chi Nan Univ, Dept Elect Engn, Puli, Taiwan
Keywords
Voice activity detection; Robust speech recognition; Speech enhancement; NOISE; COMPENSATION
DOI
10.1186/1687-6180-2012-189
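Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract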
In this article, we present an effective compensation scheme to improve noise robustness for the spectra of speech signals. In this compensation scheme, called magnitude spectrum enhancement (MSE), a voice activity detection (VAD) process is performed on the frame sequence of the utterance. The magnitude spectra of non-speech frames are then reduced while those of speech frames are amplified. In experiments conducted on the Aurora-2 noisy digits database, MSE achieves an error reduction rate of nearly 42% relative to baseline processing. This method outperforms well-known spectral-domain speech enhancement techniques, including spectral subtraction (SS) and Wiener filtering (WF). In addition, the proposed MSE can be integrated with cepstral-domain robustness methods, such as mean and variance normalization (MVN) and histogram normalization (HEQ), to achieve further improvements in recognition accuracy under noise-corrupted environments.
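The abstract describes MSE only in outline: run VAD over the frame sequence, then attenuate the magnitude spectra of non-speech frames and amplify those of speech frames. The following is a minimal Python sketch of that idea, not the authors' implementation. The energy-threshold VAD, the function names, and the gain values `speech_gain` and `nonspeech_gain` are all assumptions made for illustration; the record does not state which VAD or scaling rule the article uses.

```python
from typing import List, Optional, Sequence


def frame_energy(magnitudes: Sequence[float]) -> float:
    """Sum of squared magnitudes for one frame."""
    return sum(m * m for m in magnitudes)


def simple_vad(frames: Sequence[Sequence[float]],
               threshold: Optional[float] = None) -> List[bool]:
    """Hypothetical energy-based VAD (an assumption, not the paper's VAD).

    A frame is flagged as speech when its energy exceeds the threshold.
    If no threshold is given, the mean frame energy of the utterance is used.
    """
    if not frames:
        return []
    energies = [frame_energy(f) for f in frames]
    if threshold is None:
        threshold = sum(energies) / len(energies)
    return [e > threshold for e in energies]


def enhance_magnitude_spectra(frames: Sequence[Sequence[float]],
                              speech_gain: float = 1.5,
                              nonspeech_gain: float = 0.5) -> List[List[float]]:
    """Scale each frame's magnitude spectrum according to its VAD decision.

    Speech frames are multiplied by speech_gain (> 1) and non-speech frames
    by nonspeech_gain (< 1). Both gain values are illustrative assumptions.
    """
    flags = simple_vad(frames)
    return [
        [m * (speech_gain if is_speech else nonspeech_gain) for m in frame]
        for frame, is_speech in zip(frames, flags)
    ]


# Three toy frames: low-energy, high-energy, low-energy.
toy_frames = [[0.1, 0.1], [1.0, 2.0], [0.2, 0.1]]
toy_flags = simple_vad(toy_frames)
toy_enhanced = enhance_magnitude_spectra(toy_frames)
```

In this toy input only the middle frame exceeds the mean energy, so it is amplified while the two outer frames are attenuated. A real front end would apply this to short-time Fourier magnitudes before feature extraction.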
Pages: 20
Related papers
50 in total
  • [41] Instantaneous Frequency Features for Noise Robust Speech Recognition
    Nayak, Shekhar
    Dhar, Shashank B.
    Bhati, Saurabhchand
    Bramhendra, Koilakuntla
    Murty, K. Sri Rama
    2019 25TH NATIONAL CONFERENCE ON COMMUNICATIONS (NCC), 2019
  • [42] Combining Binaural and Cortical Features for Robust Speech Recognition
    Spille, Constantin
    Kollmeier, Birger
    Meyer, Bernd T.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (04) : 756 - 767
  • [43] Investigation of robust features for speech recognition in hostile environments
    Toh, AM
    Togneri, R
    Nordholm, S
    2005 Asia-Pacific Conference on Communications (APCC), Vols 1 & 2, 2005 : 956 - 960
  • [44] Multiband, Multisensor Robust Features for Noisy Speech Recognition
    Dimitriadis, Dimitrios
    Maragos, Petros
    Lefkimmiatis, Stamatios
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007 : 889 - 892
  • [45] Multistream Bandpass Modulation Features for Robust Speech Recognition
    Nemala, Sridhar Krishna
    Patil, Kailash
    Elhilali, Mounya
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011 : 1284 - 1287
  • [46] Statistical estimation of unreliable features for robust speech recognition
    Renevey, P
    Drygajlo, A
    2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000 : 1731 - 1734
  • [47] Robust Speech Recognition Combining Cepstral and Articulatory Features
    Zha, Zhuan-ling
    Hu, Jin
    Zhan, Qing-ran
    Shan, Ya-hui
    Xie, Xiang
    Wang, Jing
    Cheng, Hao-bo
    PROCEEDINGS OF 2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2017 : 1401 - 1405
  • [48] Robust speech detection based on phoneme recognition features
    Mihelic, France
    Zibert, Janez
    TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2006, 4188 : 455 - 462
  • [49] Robust Speech Recognition via Enhancing the Complex-Valued Acoustic Spectrum in Modulation Domain
    Hung, Jeih-Weih
    Hsieh, Hsin-Ju
    Chen, Berlin
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (02) : 236 - 251
  • [50] Study on the Use and Adaptation of Bottleneck Features for Robust Speech Recognition of Nonlinearly Distorted Speech
    Malek, Jiri
    Cerva, Petr
    Seps, Ladislav
    Nouza, Jan
    SIGMAP: PROCEEDINGS OF THE 13TH INTERNATIONAL JOINT CONFERENCE ON E-BUSINESS AND TELECOMMUNICATIONS - VOL. 5, 2016 : 65 - 71