Investigation into a Mel subspace based front-end processing for robust speech recognition

Cited by: 1
Authors
Selouani, SA [1]
O'Shaughnessy, D [1]
Affiliation
[1] Univ Moncton, Moncton, NB E1A 3E9, Canada
Keywords
speech recognition; neural networks; genetic algorithms; noise reduction
DOI
10.1109/ISSPIT.2004.1433718
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper addresses the issue of noise reduction applied to robust large-vocabulary continuous-speech recognition (CSR). We investigate strategies based on subspace filtering, which has proven very effective in the area of speech enhancement. We compare original hybrid techniques that combine the Karhunen-Loève Transform (KLT), Multilayer Perceptron (MLP), and Genetic Algorithms (GAs) in order to obtain less-variant Mel-frequency parameters. An advantage of these methods is that they do not require estimation of either the noise or the speech spectra. To evaluate the effectiveness of these methods, an extensive set of recognition experiments is carried out in a severe interfering car-noise environment for a wide range of SNRs, varying from 16 dB to -4 dB, using a noisy version of the TIMIT database.
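As a rough illustration of the subspace-filtering idea mentioned in the abstract, the sketch below projects noisy feature vectors onto the leading eigenvectors of their covariance matrix (a KLT, equivalent here to PCA) and reconstructs them. It is not the authors' hybrid KLT/MLP/GA method: the function name `klt_subspace_filter`, the synthetic 13-dimensional "features", the subspace rank, and the noise level are all assumptions chosen for the example.

```python
import numpy as np


def klt_subspace_filter(features, n_components):
    """Project rows of `features` onto the top KLT axes and reconstruct.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    # Sample covariance of the feature dimensions (columns are variables).
    cov = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    basis = eigvecs[:, order[:n_components]]  # leading principal axes
    coeffs = centered @ basis                 # KLT coefficients
    return coeffs @ basis.T + mean            # back to the feature space


# Synthetic stand-in for Mel-frequency feature frames (assumption):
# 200 frames of 13 coefficients whose clean part lies in a 3-D subspace.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 13))
clean = latent @ mixing
noisy = clean + 0.3 * rng.normal(size=clean.shape)

filtered = klt_subspace_filter(noisy, n_components=3)
err_noisy = np.mean((noisy - clean) ** 2)
err_filtered = np.mean((filtered - clean) ** 2)
print(f"MSE before filtering: {err_noisy:.4f}")
print(f"MSE after filtering:  {err_filtered:.4f}")
```

Discarding the low-variance axes removes the part of the noise that falls outside the retained subspace. No explicit estimate of the noise or speech spectrum is needed, which is the property the abstract highlights. In a real front-end the rank would have to be chosen or optimized rather than fixed by hand.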
Pages: 187-190
Page count: 4