Non-Negative Subspace Projection During Conventional MFCC Feature Extraction for Noise Robust Speech Recognition

Cited by: 0
Authors
Kumar, D. S. Pavan [1]
Bilgi, Raghavendra R. [1]
Umesh, S. [1]
Affiliations
[1] Indian Inst Technol, Dept Elect Engn, Madras 600036, Tamil Nadu, India
Keywords
Speech recognition; noise robustness; non-negative matrix factorization; Mel-frequency cepstral coefficients
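DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809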
Abstract
An additional feature-processing step based on Non-negative Matrix Factorization (NMF) is proposed for inclusion in the conventional extraction of Mel-frequency cepstral coefficients (MFCC), with the aim of achieving noise robustness in HMM-based speech recognition. The proposed approach reconstructs the log-Mel filterbank outputs of speech data from a set of building blocks that form the bases of a speech subspace. The bases are learned by applying standard NMF to the training data. A variation on learning the bases is also proposed, which uses histogram-equalized activation coefficients during training to achieve noise robustness. The proposed methods give up to 5.96% absolute improvement in recognition accuracy on the Aurora-2 task over a baseline with standard MFCCs, and up to 13.69% improvement when combined with other feature normalization techniques such as Histogram Equalization (HEQ) and Heteroscedastic Linear Discriminant Analysis (HLDA).
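The subspace idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the filterbank features have already been made non-negative (the record does not say how), uses the standard Lee-Seung multiplicative updates for the Frobenius objective, and runs on synthetic data. The function names `learn_bases` and `project`, the 23-band size, and the rank of 5 are hypothetical choices made only for illustration. The histogram-equalized variant is not shown.

```python
import numpy as np

def learn_bases(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor non-negative V (bands x frames) as W @ H.

    Uses Lee-Seung multiplicative updates for the Frobenius objective.
    W holds the bases (bands x rank), H the activations (rank x frames).
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def project(V, W, n_iter=200, eps=1e-9, seed=0):
    """Keep the bases W fixed, estimate activations H for new frames V,
    and return the reconstruction W @ H together with H."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return W @ H, H

# Synthetic stand-in for non-negative filterbank features (assumption:
# 23 bands, data generated from 5 hidden non-negative bases).
rng = np.random.default_rng(1)
true_bases = rng.random((23, 5))
train = true_bases @ rng.random((5, 300))

W, H_train = learn_bases(train, rank=5)

# New frames with additive non-negative perturbation standing in for noise.
clean = true_bases @ rng.random((5, 50))
noisy = clean + 0.1 * rng.random(clean.shape)
recon, H_test = project(noisy, W)

print("training reconstruction error:", np.linalg.norm(train - W @ H_train))
print("reconstructed frames shape:", recon.shape)
```

In a pipeline of the kind the abstract describes, the reconstructed frames would take the place of the original filterbank outputs before the remaining MFCC steps; where exactly the authors apply the reconstruction is not specified in this record.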
Pages: 5