Enhancing the magnitude spectrum of speech features for robust speech recognition

Cited by: 1
Authors
Hung, Jeih-weih [1]
Fan, Hao-teng [1]
Tu, Wen-hsiang [1]
Affiliations
[1] Natl Chi Nan Univ, Dept Elect Engn, Puli, Taiwan
Keywords
Voice activity detection; Robust speech recognition; Speech enhancement; Noise; Compensation
DOI
10.1186/1687-6180-2012-189
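Chinese Library Classification (CLC) codes
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract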
In this article, we present an effective compensation scheme to improve noise robustness for the spectra of speech signals. In this compensation scheme, called magnitude spectrum enhancement (MSE), a voice activity detection (VAD) process is performed on the frame sequence of the utterance. The magnitude spectra of non-speech frames are then reduced while those of speech frames are amplified. In experiments conducted on the Aurora-2 noisy digits database, MSE achieves an error reduction rate of nearly 42% relative to baseline processing. This method outperforms well-known spectral-domain speech enhancement techniques, including spectral subtraction (SS) and Wiener filtering (WF). In addition, the proposed MSE can be integrated with cepstral-domain robustness methods, such as mean and variance normalization (MVN) and histogram normalization (HEQ), to achieve further improvements in recognition accuracy under noise-corrupted environments.
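The frame-level idea in the abstract (run VAD, attenuate non-speech frames, amplify speech frames) can be sketched as below. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the VAD or the scaling rule, so the energy-threshold detector `simple_vad`, the `threshold_ratio` value, and the fixed gains `speech_gain` and `nonspeech_gain` are all hypothetical choices made for this example.

```python
from typing import List, Sequence


def frame_energy(magnitudes: Sequence[float]) -> float:
    """Return the sum of squared magnitudes of one frame."""
    return sum(m * m for m in magnitudes)


def simple_vad(frames: Sequence[Sequence[float]],
               threshold_ratio: float = 0.5) -> List[bool]:
    """Hypothetical energy-based VAD (an assumption, not the paper's detector).

    A frame is labeled speech when its energy exceeds
    threshold_ratio times the mean frame energy of the utterance.
    """
    if not frames:
        return []
    energies = [frame_energy(f) for f in frames]
    threshold = threshold_ratio * sum(energies) / len(energies)
    return [e > threshold for e in energies]


def enhance_magnitude_spectra(frames: Sequence[Sequence[float]],
                              speech_gain: float = 1.5,
                              nonspeech_gain: float = 0.5) -> List[List[float]]:
    """Amplify speech frames and attenuate non-speech frames.

    The fixed gains are illustrative placeholders; the abstract only says
    that speech spectra are amplified and non-speech spectra are reduced.
    """
    flags = simple_vad(frames)
    return [
        [m * (speech_gain if is_speech else nonspeech_gain) for m in frame]
        for frame, is_speech in zip(frames, flags)
    ]


if __name__ == "__main__":
    # Three toy magnitude-spectrum frames: quiet, loud, quiet.
    toy_frames = [[0.1, 0.1], [1.0, 2.0], [0.1, 0.2]]
    print(simple_vad(toy_frames))
    print(enhance_magnitude_spectra(toy_frames))
```

In a real front end the frames would come from a short-time Fourier transform of the utterance, and the enhanced magnitudes would feed the usual feature extraction before any cepstral-domain normalization such as MVN or HEQ.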
Pages: 20
Related papers
50 in total
  • [1] Enhancing the magnitude spectrum of speech features for robust speech recognition
    Jeih-weih Hung
    Hao-teng Fan
    Wen-hsiang Tu
    EURASIP Journal on Advances in Signal Processing, 2012
  • [2] Magnitude Spectrum Enhancement for Robust Speech Recognition
    Tu, Wen-hsiang
    Hung, Jeih-weih
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4586 - 4589
  • [3] Robust Feature Extraction for Speech Recognition by Enhancing Auditory Spectrum
    Alam, Md Jahangir
    Kenny, Patrick
    O'Shaughnessy, Douglas
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1358 - 1361
  • [4] Deep Scattering Power Spectrum Features for Robust Speech Recognition
    Joy, Neethu M.
    Oglic, Dino
    Cvetkovic, Zoran
    Bell, Peter
    Renals, Steve
    INTERSPEECH 2020, 2020, : 1673 - 1677
  • [5] Normalizing the speech modulation spectrum for robust speech recognition
    Xiao, Xiong
    Chng, Eng Siong
    Li, Haizhou
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 1021 - +
  • [6] Noise Robust Speech Features for Automatic Continuous Speech Recognition using Running Spectrum Analysis
    Ohnuki, Kazunaga
    Takahashi, Wataru
    Yoshizawa, Shingo
    Miyanaga, Yoshikazu
    2008 INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES, 2008, : 150 - 153
  • [7] Speech Emotion Recognition Using Magnitude and Phase Features
    Shankar D.R.
    Manjula R.B.
    Biradar R.C.
    SN Computer Science, 5 (5)
  • [8] Enhancing Spontaneous Speech Recognition with BLSTM Features
    Woellmer, Martin
    Schuller, Bjoern
    ADVANCES IN NONLINEAR SPEECH PROCESSING, 2011, 7015 : 17 - 24
  • [9] Histogram equalization of contextual statistics of speech features for robust speech recognition
    Hsieh, Hsin-Ju
    Chen, Berlin
    Hung, Jeih-weih
    MULTIMEDIA TOOLS AND APPLICATIONS, 2015, 74 (17) : 6769 - 6795