Gammatone Wavelet Cepstral Coefficients for Robust Speech Recognition

Authors
Adiga, Aniruddha [1 ]
Magimai-Doss, Mathew [2 ]
Seelamantula, Chandra Sekhar [1 ]
Affiliations
[1] Indian Inst Sci, Dept Elect Engn, Bangalore 560012, Karnataka, India
[2] Idiap Res Inst, Martigny, Switzerland
Funding
Swiss National Science Foundation;
Keywords
Gammatone wavelets; Auditory modeling; Cepstral coefficients; Speech recognition; REPRESENTATIONS;
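DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Subject classification code
0808 ; 0809 ;
Abstract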
We develop noise-robust features using Gammatone wavelets derived from the popular Gammatone functions. These wavelets incorporate characteristics of the human peripheral auditory system, in particular the spatially varying frequency response of the basilar membrane. We refer to the new features as Gammatone Wavelet Cepstral Coefficients (GWCC). Extracting GWCCs from a speech signal follows the conventional Mel-Frequency Cepstral Coefficients (MFCC) procedure, differing in the type of filterbank used: we replace the mel filterbank in MFCC with a filterbank constructed from Gammatone wavelets. We also explore Gammatone filterbank based features, Gammatone Cepstral Coefficients (GCC), for robust speech recognition. On the AURORA 2 database, a comparison of GWCCs and GCCs with MFCCs shows that the Gammatone-based features yield better recognition performance at low SNRs.
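The pipeline the abstract describes (an MFCC-style chain with the mel filterbank swapped for a Gammatone one) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses a plain Gammatone filterbank (closer to GCC than to GWCC, since the abstract does not give the wavelet construction), approximates each filter by its frequency-domain magnitude response, and picks an 8 kHz sampling rate, 256-sample frames, 20 channels, and 13 coefficients purely for illustration. All function names are hypothetical.

```python
import cmath
import math


def erb_centers(f_lo, f_hi, n):
    """n center frequencies equally spaced on the ERB-rate scale (Glasberg-Moore)."""
    hz_to_erb = lambda f: 21.4 * math.log10(1.0 + 0.00437 * f)
    erb_to_hz = lambda e: (10.0 ** (e / 21.4) - 1.0) / 0.00437
    lo, hi = hz_to_erb(f_lo), hz_to_erb(f_hi)
    return [erb_to_hz(lo + (hi - lo) * i / (n - 1)) for i in range(n)]


def power_spectrum(frame):
    """Hamming-windowed power spectrum via a naive DFT (fine for one short frame)."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        s = sum(w * cmath.exp(-2j * math.pi * k * i / n) for i, w in enumerate(win))
        spec.append(abs(s) ** 2)
    return spec


def gammatone_energies(pspec, fs, centers, order=4):
    """Weight the power spectrum by approximate Gammatone power responses.

    Assumption: |H(f)|^2 ~ (1 + ((f - fc) / b)^2)^(-order), b = 1.019 * ERB(fc).
    """
    n_fft = 2 * (len(pspec) - 1)  # valid for even frame lengths
    energies = []
    for fc in centers:
        b = 1.019 * 24.7 * (4.37 * fc / 1000.0 + 1.0)
        e = sum(p * (1.0 + ((k * fs / n_fft - fc) / b) ** 2) ** (-order)
                for k, p in enumerate(pspec))
        energies.append(e)
    return energies


def cepstrum(energies, n_ceps):
    """Log-compress the channel energies and apply an unnormalized DCT-II."""
    m = len(energies)
    logs = [math.log(e + 1e-12) for e in energies]  # small floor avoids log(0)
    return [sum(l * math.cos(math.pi * k * (j + 0.5) / m) for j, l in enumerate(logs))
            for k in range(n_ceps)]


FS = 8000  # assumed sampling rate for this sketch
frame = [math.sin(2 * math.pi * 1000 * t / FS) for t in range(256)]  # 1 kHz test tone
centers = erb_centers(100.0, 4000.0, 20)
energies = gammatone_energies(power_spectrum(frame), FS, centers)
features = cepstrum(energies, 13)
peak_center = centers[energies.index(max(energies))]
```

For the 1 kHz test tone, the channel with the largest energy should be the one centered closest to 1 kHz, which is a quick sanity check on the filterbank. A real front end would process overlapping frames of speech and typically use an FFT rather than the naive DFT shown here.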
Pages: 4
Related papers
50 in total
  • [41] A Cepstral PDF Normalization Method for Noise Robust Speech Recognition
    Suk, Yong Ho
    Choi, Seung Ho
    ADVANCES IN COMPUTER SCIENCE, ENVIRONMENT, ECOINFORMATICS, AND EDUCATION, PT II, 2011, 215 : 34 - +
  • [42] Gammatone Frequency Cepstral Coefficients for Speaker Identification over VoIP Networks
    Bouziane, Ayoub
    Kharroubi, Jamal
    Zarghili, Arsalane
    2016 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY FOR ORGANIZATIONS DEVELOPMENT (IT4OD), 2016,
  • [43] Combined Waveform-Cepstral Representation for Robust Speech Recognition
    Ager, Matthew
    Cvetkovic, Zoran
    Sollich, Peter
    2011 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY PROCEEDINGS (ISIT), 2011, : 864 - 868
  • [44] Multichannel Cepstral Domain Feature Warping for Robust Speech Recognition
    Squartini, Stefano
    Fagiani, Marco
    Principi, Emanuele
    Piazza, Francesco
    NEURAL NETS WIRN10, 2011, 226 : 284 - 292
  • [45] A late fusion deep neural network for robust speaker identification using raw waveforms and gammatone cepstral coefficients
    Salvati, Daniele
    Drioli, Carlo
    Foresti, Gian Luca
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 222
  • [46] Robust Automatic Speech Recognition Features using Complex Wavelet Packet Transform Coefficients
    Sen, Tjong Wan
    Trilaksono, Bambang Riyanto
    Arman, Arry Akhmad
    Mandala, Rila
    JOURNAL OF ICT RESEARCH AND APPLICATIONS, 2009, 3 (02) : 123 - 134
  • [47] Noise robust Chinese speech recognition using feature vector normalization and higher-order cepstral coefficients
    Wang, X
    Dong, Y
    Häkkinen, J
    Viikki, O
    2000 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I-III, 2000, : 738 - 741
  • [48] Wavelet cepstral coefficients for isolated speech recognition
    Adam, T.B. (tarmizi_adam2005@yahoo.com), 1600, Universitas Ahmad Dahlan (11):
  • [49] Spectral peak-weighted liftering of cepstral coefficients for speech recognition
    Kim, HK
    Lee, HS
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2000, E83D (07) : 1540 - 1549
  • [50] SUBSPACE PROJECTION CEPSTRAL COEFFICIENTS FOR NOISE ROBUST ACOUSTIC EVENT RECOGNITION
    Park, Sangwook
    Lee, Younglo
    Han, David K.
    Ko, Hanseok
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 761 - 765