Non-Negative Subspace Projection During Conventional MFCC Feature Extraction for Noise Robust Speech Recognition

Cited by: 0
Authors
Kumar, D. S. Pavan [1]
Bilgi, Raghavendra R. [1]
Umesh, S. [1]
Affiliations
[1] Indian Inst Technol, Dept Elect Engn, Madras 600036, Tamil Nadu, India
Keywords
Speech recognition; noise robustness; non-negative matrix factorization; Mel-frequency cepstral coefficients
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract
An additional feature processing step based on Non-negative Matrix Factorization (NMF) is proposed for inclusion in the conventional extraction of Mel-frequency cepstral coefficients (MFCC), to achieve noise robustness in HMM-based speech recognition. The proposed approach reconstructs the log-Mel filterbank outputs of speech data from a set of building blocks that form the bases of a speech subspace. The bases are learned using standard NMF of the training data. A variant of the basis learning is also proposed, which uses histogram-equalized activation coefficients during training to achieve noise robustness. The proposed methods give up to 5.96% absolute improvement in recognition accuracy on the Aurora-2 task over a baseline with standard MFCCs, and up to 13.69% improvement when combined with other feature normalization techniques such as Histogram Equalization (HEQ) and Heteroscedastic Linear Discriminant Analysis (HLDA).
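The subspace reconstruction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the Euclidean-cost multiplicative updates of Lee and Seung as the "standard NMF", assumes the log-Mel values have already been made non-negative (for example by flooring the filterbank energies at 1 before taking the log), and omits the histogram-equalized variant. All function names (`learn_bases`, `reconstruct`, and the helpers) and the toy matrices are hypothetical.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def update(X, numer, denom, eps=1e-9):
    """Element-wise multiplicative update X * numer / denom (keeps X >= 0)."""
    return [[x * n / (d + eps) for x, n, d in zip(xr, nr, dr)]
            for xr, nr, dr in zip(X, numer, denom)]

def random_matrix(rows, cols, rng):
    """Strictly positive random initialization."""
    return [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rows)]

def learn_bases(V, rank, n_iter=200, seed=0):
    """Factor non-negative V (bins x frames) as W @ H; W holds the bases."""
    rng = random.Random(seed)
    W = random_matrix(len(V), rank, rng)
    H = random_matrix(rank, len(V[0]), rng)
    for _ in range(n_iter):
        Wt = transpose(W)
        H = update(H, matmul(Wt, V), matmul(matmul(Wt, W), H))
        Ht = transpose(H)
        W = update(W, matmul(V, Ht), matmul(W, matmul(H, Ht)))
    return W, H

def reconstruct(V, W, n_iter=200, seed=1):
    """Keep the bases W fixed, estimate activations for V, return W @ H."""
    rng = random.Random(seed)
    H = random_matrix(len(W[0]), len(V[0]), rng)
    Wt = transpose(W)
    WtV, WtW = matmul(Wt, V), matmul(Wt, W)
    for _ in range(n_iter):
        H = update(H, WtV, matmul(WtW, H))
    return matmul(W, H)

def frobenius_norm(A):
    """Square root of the sum of squared entries."""
    return sum(a * a for row in A for a in row) ** 0.5

def difference(A, B):
    """Element-wise A - B."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Toy "training data": 4 filterbank bins x 6 frames, built from 2 hidden bases.
W_true = [[1, 0], [2, 1], [0, 3], [1, 1]]
H_true = [[1, 2, 0, 1, 3, 2], [2, 0, 1, 1, 0, 3]]
V_train = matmul(W_true, H_true)

W_learned, _ = learn_bases(V_train, rank=2)
V_hat = reconstruct(V_train, W_learned)
rel_error = frobenius_norm(difference(V_train, V_hat)) / frobenius_norm(V_train)
```

In this sketch the bases are learned once from training data, and every frame, whether clean or noisy, is replaced by its reconstruction `W @ H` before the remaining MFCC steps. The intuition is that a reconstruction restricted to the speech subspace carries less of the noise that the bases cannot represent.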
Pages: 5
Related papers
50 in total
  • [41] Bayesian Feature Enhancement for Reverberation and Noise Robust Speech Recognition
    Leutnant, Volker
    Krueger, Alexander
    Haeb-Umbach, Reinhold
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (08) : 1640 - 1652
  • [42] The perceptual wavelet feature for noise robust Vietnamese speech recognition
    Trung, Nguyen Quoc
    Nghia, Phung Trung
    2008 SECOND INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND ELECTRONICS, 2008, : 255 - +
  • [43] Feature domain compensation of nonstationary noise for robust speech recognition
    Kim, NS
    SPEECH COMMUNICATION, 2002, 37 (3-4) : 231 - 248
  • [44] Non-negative Feature Extraction using Conjugate Gradient Method
    Zhang, Jiawen
    Chen, Wen-Sheng
    Pan, Binbin
    2019 15TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY (CIS 2019), 2019, : 402 - 405
  • [45] Non-negative subspace feature representation for few-shot learning in medical
    Fan, Keqiang
    Cai, Xiaohao
    Niranjan, Mahesan
    IMAGE AND VISION COMPUTING, 2024, 152
  • [46] Non-negative Matrix Factorization: Robust Extraction of Extended Structures
    Ren, Bin
    Pueyo, Laurent
    Ben Zhu, Guangtun
    Debes, John
    Duchene, Gaspard
    ASTROPHYSICAL JOURNAL, 2018, 852 (02):
  • [47] Robust Feature Extraction Methods for Speech Recognition in Noisy Environments
    Mukheolkar, Ajinkya Sunil
    Alex, John Sahaya Rani
    2014 FIRST INTERNATIONAL CONFERENCE ON NETWORKS & SOFT COMPUTING (ICNSC), 2014, : 295 - 299
  • [48] A bio-inspired feature extraction for robust speech recognition
    Zouhir, Youssef
    Ouni, Kais
    SPRINGERPLUS, 2014, 3
  • [49] Temporal modulation normalization for robust speech feature extraction and recognition
    Lu, Xugang
    Matsuda, Shigeki
    Unoki, Masashi
    Nakamura, Satoshi
    MULTIMEDIA TOOLS AND APPLICATIONS, 2011, 52 (01) : 187 - 199