Comparison of Feature Extraction Methods for Speech Recognition in Noise-Free and in Traffic Noise Environment

Cited by: 0
Authors
Sarosi, Gellert [1]
Mozsary, Mihaly [1]
Mihajlik, Peter [1, 2]
Fegyo, Tibor [1, 3]
Affiliations
[1] Budapest Univ Technol & Econ, Dept Telecommun & Media Informat, Budapest, Hungary
[2] THINKTech Res Ctr Nonprofit LLC, Budapest, Hungary
[3] Aitia Int Inc, Budapest, Hungary
Keywords
feature extraction; multiple languages; multiple sample rates; real-life and white noise; varied SNR
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A crucial part of a speech recognizer is acoustic feature extraction, especially when the application is intended for use in noisy environments. In this paper we investigate several novel front-end techniques and compare them to multiple baselines. Recognition tests were performed on studio-quality wideband recordings in Hungarian, as well as on narrowband telephone speech, including real-life noises, collected in six languages: English, German, French, Italian, Spanish and Hungarian. The following baseline feature types were used with several settings: Mel Frequency Cepstral Coefficients (MFCC) and Perceptual Linear Prediction (PLP) features, as implemented in HTK, in SPHINX, or by ourselves. Novel methods include Perceptual Minimum Variance Distortionless Response (PMVDR) and multiple variations of the Power-Normalized Cepstral Coefficients (PNCC). In addition, adaptive techniques are applied to reduce convolutive distortions. We observed a significant difference between the MFCC implementations, and the PNCC variations that proved useful differed markedly across bandwidths and noise conditions.
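The keywords mention white noise and varied SNR, but this record does not say how the noisy test material was prepared. As an illustration only, and not the authors' procedure, the following minimal sketch shows one common way to mix a noise signal into clean speech at a target signal-to-noise ratio. The function names mix_at_snr and measured_snr_db are hypothetical, and the inputs are assumed to be plain lists of float samples at the same sample rate.

```python
import math
import random


def power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mixture has the requested SNR in dB.

    Hypothetical helper for illustration: the noise is truncated to the
    length of the speech and multiplied by a single gain factor.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = power(speech)
    p_noise = power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # SNR_dB = 10 * log10(p_speech / (gain^2 * p_noise)), solved for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """SNR in dB of `noisy` relative to the known clean `speech`."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10 * math.log10(power(speech) / power(residual))


if __name__ == "__main__":
    random.seed(0)
    # Stand-ins for real recordings: a 440 Hz tone and white Gaussian noise,
    # one second each at an assumed 8 kHz (narrowband) sample rate.
    clean = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
    white = [random.gauss(0.0, 1.0) for _ in range(8000)]
    for target in (20, 10, 0):
        mixed = mix_at_snr(clean, white, target)
        print(target, round(measured_snr_db(clean, mixed), 3))
```

A single global gain is the simplest choice. Real-life noise such as traffic is non-stationary, so the local SNR within an utterance can still vary considerably around the target value.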
Pages: 8