Non-Negative Subspace Projection During Conventional MFCC Feature Extraction for Noise Robust Speech Recognition

Cited by: 0
Authors
Kumar, D. S. Pavan [1 ]
Bilgi, Raghavendra R. [1 ]
Umesh, S. [1 ]
Affiliations
[1] Indian Inst Technol, Dept Elect Engn, Madras 600036, Tamil Nadu, India
Keywords
Speech recognition; noise robustness; non-negative matrix factorization; Mel-frequency cepstral coefficients
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract
An additional feature processing algorithm using Non-negative Matrix Factorization (NMF) is proposed for inclusion in the conventional extraction of Mel-frequency cepstral coefficients (MFCC), to achieve noise robustness in HMM-based speech recognition. The proposed approach reconstructs log-Mel filterbank outputs of speech data from a set of building blocks that form the bases of a speech subspace. The bases are learned using the standard NMF of training data. A variation of learning the bases is also proposed, which uses histogram-equalized activation coefficients during training to achieve noise robustness. The proposed methods give up to 5.96% absolute improvement in recognition accuracy on the Aurora-2 task over a baseline with standard MFCCs, and up to 13.69% improvement when combined with other feature normalization techniques such as Histogram Equalization (HEQ) and Heteroscedastic Linear Discriminant Analysis (HLDA).
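The reconstruction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Euclidean-cost NMF with Lee-Seung multiplicative updates, non-negative log-Mel inputs, synthetic random data in place of Aurora-2 features, and hypothetical names (`learn_bases`, `project`, 23 filterbank channels, rank 10). The histogram-equalized variant of basis learning is not shown.

```python
import numpy as np

def learn_bases(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Learn non-negative bases W such that V is approximately W @ H.

    V: (n_bands, n_frames) non-negative training features.
    Uses multiplicative updates for the Euclidean (Frobenius) cost.
    """
    rng = np.random.default_rng(seed)
    n_bands, n_frames = V.shape
    W = rng.random((n_bands, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W

def project(V, W, n_iter=200, eps=1e-9, seed=0):
    """Hold the bases W fixed, estimate activations H, and return W @ H."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return W @ H

# Synthetic stand-ins for non-negative log-Mel filterbank outputs (assumption).
rng = np.random.default_rng(1)
train = rng.random((23, 500))   # 23 channels x 500 training frames
test = rng.random((23, 40))     # 23 channels x 40 test frames

W = learn_bases(train, rank=10)
recon = project(test, W)        # reconstruction within the learned subspace
```

In a conventional MFCC pipeline, the reconstructed filterbank outputs would then pass through the DCT in place of the originals. The rank and iteration count here are arbitrary illustrative choices.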
Pages: 5
Related papers
50 in total
  • [31] Robust non-negative supervised low-rank discriminant embedding (NSLRDE) for feature extraction
    Minghua Wan
    Chengxu Yan
    Tianming Zhan
    Hai Tan
    Guowei Yang
    International Journal of Machine Learning and Cybernetics, 2023, 14 : 2155 - 2168
  • [32] Combining speech enhancement and auditory feature extraction for robust speech recognition
    Kleinschmidt, M
    Tchorz, J
    Kollmeier, B
    SPEECH COMMUNICATION, 2001, 34 (1-2) : 75 - 91
  • [33] Use of novel feature extraction technique with subspace classifiers for speech recognition
    Gunal, Serkan
    Edizkan, Rifat
    2007 IEEE INTERNATIONAL CONFERENCE ON PERVASIVE SERVICES, 2007, : 80 - +
  • [34] Hierarchical Speech Recognition System Using MFCC Feature Extraction and Dynamic Spiking RSOM
    Tarek, Behi
    Najet, Arous
    Noureddine, Ellouze
    2014 15TH IEEE/ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING (SNPD), 2014, : 41 - 46
  • [35] A New Feature Extraction and Recognition Method for Microexpression Based on Local Non-negative Matrix Factorization
    Gao, Junli
    Chen, Huajun
    Zhang, Xiaohua
    Guo, Jing
    Liang, Wenyu
    FRONTIERS IN NEUROROBOTICS, 2020, 14
  • [36] A Review of Signal Subspace Speech Enhancement and Its Application to Noise Robust Speech Recognition
    Kris Hermus
    Patrick Wambacq
    Hugo Van hamme
    EURASIP Journal on Advances in Signal Processing, 2007
  • [37] A review of signal subspace speech enhancement and its application to noise robust speech recognition
    Hermus, Kris
    Wambacq, Patrick
    Van hamme, Hugo
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2007, 2007 (1)
  • [38] Robust Non-negative Matrix Factorization with β-Divergence for Speech Separation
    Li, Yinan
    Zhang, Xiongwei
    Sun, Meng
    ETRI JOURNAL, 2017, 39 (01) : 21 - 29
  • [39] FULLY SUPERVISED NON-NEGATIVE MATRIX FACTORIZATION FOR FEATURE EXTRACTION
    Austin, Woody
    Anderson, Dylan
    Ghosh, Joydeep
    IGARSS 2018 - 2018 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2018, : 5772 - 5775
  • [40] Noise robust speech parameterization using multiresolution feature extraction
    Hariharan, R
    Kiss, I
    Viikki, O
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (08): : 856 - 865