Time-regularized linear prediction for noise-robust extraction of the spectral envelope of speech

Cited by: 5
Authors
Airaksinen, Manu [1]
Juvela, Lauri [1]
Rasanen, Okko [1]
Alku, Paavo [1]
Affiliations
[1] Aalto Univ, Espoo, Finland
Funding
Academy of Finland
Keywords
speech analysis; linear prediction; robust features
DOI
10.21437/Interspeech.2018-1230
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Feature extraction of speech signals is typically performed in short-time frames by assuming that the signal is stationary within each frame. For the extraction of the spectral envelope of speech, which conveys the formant frequencies produced by the resonances of the slowly varying vocal tract, a commonly used frame length is 20-30 ms. However, this kind of conventional frame-based spectral analysis is oblivious to the broader temporal context of the signal and is prone to degradation by, for example, environmental noise. In this paper, we propose a new frame-based linear prediction (LP) analysis method that includes a regularization term penalizing energy differences between consecutive frames of an all-pole spectral envelope model. This integrates the slowly varying nature of the vocal tract into the analysis. Objective evaluations of feature distortion and phonetic representational capability were performed by studying the properties of the mel-frequency cepstral coefficient (MFCC) representations computed with different spectral estimation methods under noisy conditions using the TIMIT database. The results show that the proposed time-regularized LP approach exhibits superior MFCC distortion behavior while simultaneously having the greatest average separability of different phoneme categories in comparison to the other methods.
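The idea of coupling consecutive frames through a regularization term can be illustrated with a small sketch. The code below is not the method of the paper: it assumes a simpler penalty on the squared difference between the predictor coefficients of the current frame and those of the previous frame, whereas the abstract describes a penalty on energy differences of the all-pole envelope model. The function names, the `prev` and `lam` parameters, and the scaling of the penalty by the frame energy are all hypothetical choices made for illustration.

```python
import math


def autocorrelation(frame, order):
    """Return lags 0..order of the unnormalised autocorrelation of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i - k] for i in range(k, n))
            for k in range(order + 1)]


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def regularized_lp(frame, order, prev=None, lam=0.0):
    """Autocorrelation-method LP with an assumed coefficient-smoothness penalty.

    Minimises  sum_n (x[n] - sum_k a[k] x[n-k])^2 + lam * r[0] * ||a - prev||^2,
    which gives the normal equations (R + lam*r[0]*I) a = r + lam*r[0]*prev.
    With lam = 0 this reduces to conventional LP.
    """
    r = autocorrelation(frame, order)
    if prev is None:
        prev = [0.0] * order
    reg = lam * r[0]  # scale the penalty by frame energy (illustrative choice)
    normal = [[r[abs(i - j)] + (reg if i == j else 0.0) for j in range(order)]
              for i in range(order)]
    rhs = [r[i + 1] + reg * prev[i] for i in range(order)]
    return solve_linear(normal, rhs)


# Usage: a damped cosine stands in for one analysis frame.
frame = [0.9 ** n * math.cos(0.3 * n) for n in range(200)]
a_plain = regularized_lp(frame, order=2)
a_prev = [0.5, 0.0]  # hypothetical coefficients from the previous frame
a_smooth = regularized_lp(frame, order=2, prev=a_prev, lam=0.5)
print(a_plain, a_smooth)
```

Increasing `lam` pulls the solution toward `prev`, so frame-to-frame fluctuations caused by noise are damped at the cost of slower tracking of genuine spectral change.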
Pages: 701-705
Page count: 5