Investigation into a Mel subspace based front-end processing for robust speech recognition

被引:1
|
作者
Selouani, SA [1 ]
O'Shaughnessy, D [1 ]
机构
[1] Univ Moncton, Moncton, NB E1A 3E9, Canada
关键词
speech recognition; neural networks; genetic algorithms; noise reduction;
D O I
10.1109/ISSPIT.2004.1433718
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This paper addresses the issue of noise reduction applied to robust large- vocabulary continuous-speech recognition (CSR). We investigate strategies based on the subspace filtering that has been proven very effective in the area of speech enhancement. We compare original hybrid techniques that combine the Karhonen-Loeve Transform (KLT), Multilayer Perceptron (MLP) and Genetic Algorithms (GAs) in order to get less-variant Mel-frequency parameters. The advantages of these methods include that they do not require estimation of either noise or speech spectra. To evaluate the effecteveness of these methods, an extensive set of recognition experiments are carried out in a severe interfering car noise environmentfor a wide range of SNRs varying from 16 dB to -4 dB using a noisy version of the TIMIT database.
引用
收藏
页码:187 / 190
页数:4
相关论文
共 50 条
  • [31] Comparing Front-End Enhancement Techniques and Multiconditioned Training for Robust Automatic Speech Recognition
    Soni, Meet H.
    Joshi, Sonal
    Panda, Ashish
    TEXT, SPEECH, AND DIALOGUE (TSD 2019), 2019, 11697 : 329 - 340
  • [32] Automatic Speech Recognition with a Cochlear Implant Front-End
    Nogueira, Waldo
    Harczos, Tamas
    Edler, Bernd
    Ostermann, Joern
    Buechner, Andreas
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1993 - +
  • [33] A Front-End Technique for Automatic Noisy Speech Recognition
    Naing, Hay Mar Soe
    Hidayat, Risanuri
    Hartanto, Rudy
    Miyanaga, Yoshikazu
    PROCEEDINGS OF 2020 23RD CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (ORIENTAL-COCOSDA 2020), 2020, : 49 - 54
  • [34] JOINT TRAINING OF FRONT-END AND BACK-END DEEP NEURAL NETWORKS FOR ROBUST SPEECH RECOGNITION
    Gao, Tian
    Du, Jun
    Dai, Li-Rong
    Lee, Chin-Hui
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4375 - 4379
  • [35] A new perceptually motivated MVDR-based acoustic front-end (PMVDR) for robust automatic speech recognition
    Yapanel, Umit H.
    Hansen, John H. L.
    SPEECH COMMUNICATION, 2008, 50 (02) : 142 - 152
  • [36] A noise-robust front-end based on tree-structured filter-bank for speech recognition
    Kil, RM
    Kim, YI
    Lee, GH
    IJCNN 2000: PROCEEDINGS OF THE IEEE-INNS-ENNS INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOL VI, 2000, : 81 - 86
  • [37] Robust Front-End based on MVA and HEQ post-processing for Arabic Speech Recognition Using Hidden Markov Model Toolkit(HTK)
    Techini, Elhem
    Sakka, Zied
    Bouhlel, MedSalim
    2017 IEEE/ACS 14TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2017, : 815 - 820
  • [38] A noise robust front-end for speech recognition using hough transform and cumulative distribution mapping
    Choi, Eric H. C.
    18TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 4, PROCEEDINGS, 2006, : 286 - +
  • [39] Feature enhancement for a bitstream-based front-end in wireless speech recognition
    Kim, HK
    Cox, RV
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 241 - 244
  • [40] A noise robust front-end with low computational cost for embedded in-car speech recognition
    Ding, Pei
    He, Lei
    Yan, Xiang
    Zhao, Rui
    Hao, Jie
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 1045 - +