Feature Extraction Based on Pitch-Synchronous Averaging for Robust Speech Recognition

Cited by: 9
Authors
Morales-Cordovilla, Juan A. [1]
Peinado, Antonio M. [1]
Sanchez, Victoria [1]
Gonzalez, Jose A. [1]
Affiliations
[1] Univ Granada, Dept Teoria Senal Telemat & Comunicac, E-18071 Granada, Spain
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2011 / Vol. 19 / No. 03
Keywords
Acoustic noise; autocorrelation-based mel frequency cepstral coefficient (AMFCC); autocorrelation estimation; pitch-synchronous analysis; robust speech recognition;
DOI
10.1109/TASL.2010.2053846
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this paper, we propose two estimators for the autocorrelation sequence of a periodic signal in additive noise. Both estimators are formulated using tables that contain all the possible products of sample pairs in a speech signal frame. The first estimator is based on pitch-synchronous averaging. We analyze this estimator statistically and show that the signal-to-noise ratio (SNR) can be increased by up to a factor equal to the number of available periods. The second estimator is similar to the first, but it avoids the sample products that are more likely to be affected by noise. We prove that, under certain conditions, this estimator can remove the effect of additive noise in a statistical sense. Both estimators are employed to extract mel frequency cepstral coefficients (MFCCs) as features for robust speech recognition. Although these estimators were initially conceived for voiced speech frames, we extend their application to unvoiced sounds in order to obtain a coherent feature extractor. The experimental results show the superiority of the proposed approach over other MFCC-based front-ends such as higher-lag autocorrelation spectrum estimation (HASE), which also relies on avoiding the autocorrelation coefficients that are more likely to be affected by noise.
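The abstract does not give the estimators' formulas, so the following is only a minimal sketch of the general idea behind pitch-synchronous averaging, not the authors' formulation. It assumes the pitch period is known and is an integer number of samples, averages the complete periods in a frame, and then reads a circular autocorrelation off a table of sample-pair products. All function names (`pitch_sync_average`, `product_table`, `circular_autocorrelation`) are hypothetical and chosen for illustration.

```python
import numpy as np


def pitch_sync_average(frame, period):
    """Average the complete pitch periods found in a frame.

    Assumption: `period` is a known integer number of samples.
    Returns the averaged period and the number of periods used.
    """
    frame = np.asarray(frame, dtype=float)
    n_periods = len(frame) // period
    if n_periods < 1:
        raise ValueError("frame must contain at least one full period")
    periods = frame[: n_periods * period].reshape(n_periods, period)
    return periods.mean(axis=0), n_periods


def product_table(x):
    """Table of all sample-pair products: table[n, m] = x[n] * x[m]."""
    x = np.asarray(x, dtype=float)
    return np.outer(x, x)


def circular_autocorrelation(one_period):
    """Circular autocorrelation of one period for lags 0..T-1.

    Each lag k is the mean of the table entries (n, (n + k) mod T).
    """
    table = product_table(one_period)
    T = table.shape[0]
    n = np.arange(T)
    return np.array([table[n, (n + k) % T].mean() for k in range(T)])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    period, n_periods = 40, 8
    one_clean_period = np.sin(2 * np.pi * np.arange(period) / period)
    clean = np.tile(one_clean_period, n_periods)
    noisy = clean + rng.normal(scale=0.5, size=clean.size)

    averaged, used = pitch_sync_average(noisy, period)
    r = circular_autocorrelation(averaged)

    # Noise energy per sample before and after averaging.
    print("periods used:", used)
    print("noise energy in frame:", np.mean((noisy - clean) ** 2))
    print("noise energy after averaging:", np.mean((averaged - one_clean_period) ** 2))
    print("first autocorrelation lags:", r[:3])
```

For independent zero-mean noise, averaging M periods leaves the periodic component unchanged while dividing the noise variance by roughly M, which is consistent with the abstract's statement that the SNR gain is bounded by the number of available periods. The second estimator, which discards the products more likely to be affected by noise, is not sketched here because the abstract does not say which products are excluded.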
Pages: 640 - 651
Page count: 12
Related papers
50 in total
  • [21] Speech feature extraction based on wavelet modulation scale for robust speech recognition
    Ma, Xin
    Zhou, Weidong
    Ju, Fang
    Jiang, Qi
    NEURAL INFORMATION PROCESSING, PT 2, PROCEEDINGS, 2006, 4233 : 499 - 505
  • [22] Geometrical feature extraction for robust speech recognition
    Li, Xiaokun
    Kwan, Chiman
    2005 39TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, VOLS 1 AND 2, 2005, : 558 - 562
  • [23] Robust speech recognition method based on discriminative environment feature extraction
    Han, JQ
    Gao, W
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2001, 16 (05) : 458 - 464
  • [24] Robust endpoint detection for speech recognition based on discriminative feature extraction
    Yamamoto, Koichi
    Jabloun, Firas
    Reinhard, Klaus
    Kawamura, Akinori
    2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 805 - 808
  • [25] Wavelet-based denoising for robust feature extraction for speech recognition
    Farooq, O
    Datta, S
    ELECTRONICS LETTERS, 2003, 39 (01) : 163 - 165
  • [26] Robust Feature Extraction for Speech Recognition Based on Perceptually Motivated MUSIC
    Han Zhi-yan
    Wang Jian
    PROCEEDINGS 2010 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND INFORMATION TECHNOLOGY, (ICCSIT 2010), VOL 1, 2010, : 98 - 102
  • [27] Robust Speech Recognition Method Based on Discriminative Environment Feature Extraction
    韩纪庆
    高文
    Journal of Computer Science and Technology, 2001, (05) : 458 - 464
  • [28] Synchrony-Based Feature Extraction for Robust Automatic Speech Recognition
    de-La-Calle-Silos, Fernando
    Stern, Richard M.
    IEEE SIGNAL PROCESSING LETTERS, 2017, 24 (08) : 1158 - 1162
  • [29] Robust speech recognition method based on discriminative environment feature extraction
    Jiqing Han
    Wen Gao
    Journal of Computer Science and Technology, 2001, 16 : 458 - 464
  • [30] PITCH-SYNCHRONOUS COMPUTED 3-DIMENSIONAL SPEECH SPECTROGRAMS
    AUTH, W
    LACROIX, A
    NACHRICHTENTECHNISCHE ZEITSCHRIFT, 1971, 24 (10) : 502 - &