Non-Negative Subspace Projection During Conventional MFCC Feature Extraction for Noise Robust Speech Recognition

Authors
Kumar, D. S. Pavan [1 ]
Bilgi, Raghavendra R. [1 ]
Umesh, S. [1 ]
Affiliations
[1] Indian Inst Technol, Dept Elect Engn, Madras 600036, Tamil Nadu, India
Keywords
Speech recognition; noise robustness; non-negative matrix factorization; Mel-frequency cepstral coefficients
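DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract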
An additional feature processing step based on non-negative matrix factorization (NMF) is proposed for inclusion in the conventional extraction of Mel-frequency cepstral coefficients (MFCCs), to achieve noise robustness in HMM-based speech recognition. The proposed approach reconstructs the log-Mel filterbank outputs of speech data from a set of building blocks that form the bases of a speech subspace. The bases are learned by applying standard NMF to the training data. A variant of the basis learning is also proposed, which uses histogram-equalized activation coefficients during training to achieve noise robustness. The proposed methods give up to 5.96% absolute improvement in recognition accuracy on the Aurora-2 task over a baseline with standard MFCCs, and up to 13.69% improvement when combined with other feature normalization techniques such as Histogram Equalization (HEQ) and Heteroscedastic Linear Discriminant Analysis (HLDA).
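The subspace reconstruction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the filterbank features have been made non-negative, uses the Euclidean-cost multiplicative updates as the "standard NMF" (the abstract does not name the cost function), and runs on synthetic data. The function names `learn_bases` and `project`, the rank of 4, and the 23-band toy dimensions are all hypothetical. The histogram-equalized variant is not covered.

```python
import numpy as np

def learn_bases(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Learn non-negative bases W such that V ~= W @ H.

    Uses multiplicative updates for the Euclidean cost (an assumption;
    the abstract only says "standard NMF").
    V has shape (n_bands, n_frames) and must be non-negative.
    """
    rng = np.random.default_rng(seed)
    n_bands, n_frames = V.shape
    W = rng.random((n_bands, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update activations
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update bases
    return W

def project(V, W, n_iter=200, eps=1e-9, seed=0):
    """Reconstruct V from fixed bases W.

    Only the non-negative activations H are estimated; the returned
    W @ H is the approximation of V within the learned subspace.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return W @ H

# Synthetic stand-in for non-negative filterbank features (hypothetical sizes).
rng = np.random.default_rng(1)
true_W = rng.random((23, 4))
train = true_W @ rng.random((4, 300))   # "training" frames
test = true_W @ rng.random((4, 50))     # "test" frames

W = learn_bases(train, rank=4)
recon = project(test, W)
rel_err = np.linalg.norm(test - recon) / np.linalg.norm(test)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In an MFCC pipeline, a step like `project` would sit between the log-Mel filterbank stage and the DCT, so that the cepstral coefficients are computed from the reconstructed features rather than the raw ones.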
Pages: 5