Gammatone Wavelet Cepstral Coefficients for Robust Speech Recognition

Cited by: 0
Authors
Adiga, Aniruddha [1 ]
Magimai-Doss, Mathew [2 ]
Seelamantula, Chandra Sekhar [1 ]
Affiliations
[1] Indian Inst Sci, Dept Elect Engn, Bangalore 560012, Karnataka, India
[2] Idiap Res Inst, Martigny, Switzerland
Funding
Swiss National Science Foundation
Keywords
Gammatone wavelets; Auditory modeling; Cepstral coefficients; Speech recognition; REPRESENTATIONS;
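DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering technology]; TN [Electronics, communication technology]
Discipline classification codes
0808; 0809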
Abstract
We develop noise-robust features using Gammatone wavelets derived from the popular Gammatone functions. These wavelets incorporate characteristics of the human peripheral auditory system, in particular the spatially varying frequency response of the basilar membrane. We refer to the new features as Gammatone Wavelet Cepstral Coefficients (GWCC). The procedure for extracting GWCCs from a speech signal is similar to that of the conventional Mel-Frequency Cepstral Coefficients (MFCC) technique; the difference lies in the type of filterbank used. We replace the conventional mel filterbank in MFCC with a Gammatone wavelet filterbank, which we construct using Gammatone wavelets. We also explore the effect of Gammatone-filterbank-based features, Gammatone Cepstral Coefficients (GCC), on robust speech recognition. On the AURORA 2 database, a comparison of GWCCs and GCCs with MFCCs shows that Gammatone-based features yield better recognition performance at low SNRs.
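The abstract describes the feature extraction as the MFCC pipeline with the mel filterbank swapped out. A minimal sketch of that idea follows, under stated assumptions: the abstract does not give the Gammatone wavelet construction, so the filterbank here uses the plain gammatone impulse response with ERB-spaced center frequencies, which is closer to the GCC variant than to GWCC. The function names, the 8 kHz sampling rate, 23 filters, 13 coefficients, and the 25 ms / 10 ms framing are illustrative choices, not values taken from the paper.

```python
import numpy as np

def erb_space(f_low, f_high, n):
    """n center frequencies (Hz), equally spaced on the ERB-rate scale."""
    to_erb = lambda f: 21.4 * np.log10(1.0 + 0.00437 * f)
    from_erb = lambda e: (10.0 ** (e / 21.4) - 1.0) / 0.00437
    return from_erb(np.linspace(to_erb(f_low), to_erb(f_high), n))

def gammatone_bank(n_filters, n_fft, fs, f_low=100.0, order=4):
    """Squared magnitude responses of gammatone filters.

    Assumption: plain gammatone impulse responses stand in for the
    paper's Gammatone wavelets, whose definition is not in the abstract.
    Returns an array of shape (n_filters, n_fft // 2 + 1).
    """
    t = np.arange(n_fft) / fs
    fc = erb_space(f_low, 0.9 * fs / 2.0, n_filters)
    bw = 1.019 * 24.7 * (4.37 * fc / 1000.0 + 1.0)  # scaled ERB bandwidth
    bank = np.empty((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        g = (t ** (order - 1) * np.exp(-2 * np.pi * bw[i] * t)
             * np.cos(2 * np.pi * fc[i] * t))
        h = np.abs(np.fft.rfft(g)) ** 2
        bank[i] = h / h.max()  # normalize each filter to unit peak gain
    return bank

def cepstral_coefficients(signal, fs, bank, n_ceps=13,
                          frame_s=0.025, hop_s=0.010):
    """MFCC-style pipeline with an arbitrary filterbank in place of mel."""
    n_fft = 2 * (bank.shape[1] - 1)
    flen, fhop = int(round(frame_s * fs)), int(round(hop_s * fs))
    n_frames = 1 + (len(signal) - flen) // fhop
    idx = np.arange(flen)[None, :] + fhop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(flen)          # windowed frames
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2  # power spectrum
    log_energy = np.log(power @ bank.T + 1e-10)      # log filterbank energies
    # DCT-II (unnormalized) over the filterbank axis gives the cepstrum.
    k = np.arange(n_ceps)[:, None]
    m = np.arange(bank.shape[0])[None, :]
    dct = np.cos(np.pi * k * (m + 0.5) / bank.shape[0])
    return log_energy @ dct.T

# Usage on one second of synthetic noise (a stand-in for real speech).
fs = 8000
rng = np.random.default_rng(0)
signal = rng.standard_normal(fs)
bank = gammatone_bank(n_filters=23, n_fft=256, fs=fs)
feats = cepstral_coefficients(signal, fs, bank)
```

Passing a mel filterbank of the same shape as `bank` would turn the same function into a conventional MFCC front end, which is the comparison the abstract draws.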
Pages: 4
Related papers
50 in total
  • [1] Whispered speech recognition based on gammatone filterbank cepstral coefficients
    B. Marković
    J. Galić
    Đ. Grozdić
    S. T. Jovičić
    M. Mijić
    Journal of Communications Technology and Electronics, 2017, 62 : 1255 - 1261
  • [2] Whispered Speech Recognition Based on Gammatone Filterbank Cepstral Coefficients
    Markovic, B.
    Galic, J.
    Grozdic, D.
    Jovicic, S. T.
    Mijic, M.
    JOURNAL OF COMMUNICATIONS TECHNOLOGY AND ELECTRONICS, 2017, 62 (11) : 1255 - 1261
  • [3] Speech Emotion Recognition Using Gammatone Cepstral Coefficients and Deep Learning Features
    Sharan, Roneel V.
    2023 IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLIED NETWORK TECHNOLOGIES, ICMLANT, 2023, : 139 - 142
  • [4] Damped Oscillator Cepstral Coefficients for Robust Speech Recognition
    Mitra, Vikramjit
    Franco, Horacio
    Graciarena, Martin
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 886 - 890
  • [5] WAVELET BASED CEPSTRAL COEFFICIENTS FOR NEURAL NETWORK SPEECH RECOGNITION
    Adam, T. B.
    Salam, M. S.
    Gunawan, T. S.
    2013 IEEE INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING APPLICATIONS (IEEE ICSIPA 2013), 2013, : 447 - 451
  • [6] Recognition of Helicopter Acoustic Signal Based on Gammatone Cepstral Coefficients
    Wang, Yong
    Meng, Hua
    Chen, Zhengwu
    Wei, Chunhua
    Liu, Lei
    Hunan Daxue Xuebao/Journal of Hunan University Natural Sciences, 2021, 48 (06) : 74 - 79
  • [7] DELTA-SPECTRAL CEPSTRAL COEFFICIENTS FOR ROBUST SPEECH RECOGNITION
    Kumar, Kshitiz
    Kim, Chanwoo
    Stern, Richard M.
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 4784 - 4787
  • [8] Speech Emotion Recognition Based on Coiflet Wavelet Packet Cepstral Coefficients
    Huang, Yongming
    Wu, Ao
    Zhang, Guobao
    Li, Yue
    PATTERN RECOGNITION (CCPR 2014), PT II, 2014, 484 : 436 - 443
  • [9] Power-Normalized Cepstral Coefficients (PNCC) for Robust Speech Recognition
    Kim, Chanwoo
    Stern, Richard M.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (07) : 1315 - 1329
  • [10] POWER-NORMALIZED CEPSTRAL COEFFICIENTS (PNCC) FOR ROBUST SPEECH RECOGNITION
    Kim, Chanwoo
    Stern, Richard M.
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4101 - 4104