Time-regularized linear prediction for noise-robust extraction of the spectral envelope of speech

Cited by: 5
Authors
Airaksinen, Manu [1]
Juvela, Lauri [1]
Rasanen, Okka [1]
Alku, Paavo [1]
Affiliations
[1] Aalto Univ, Espoo, Finland
Funding
Academy of Finland;
Keywords
speech analysis; linear prediction; robust features;
DOI
10.21437/Interspeech.2018-1230
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature extraction of speech signals is typically performed in short-time frames by assuming that the signal is stationary within each frame. For the extraction of the spectral envelope of speech, which conveys the formant frequencies produced by the resonances of the slowly varying vocal tract, an often used frame length is within 20-30 ms. However, this kind of conventional frame-based spectral analysis is oblivious of the broader temporal context of the signal and is prone to degradation by, for example, environmental noise. In this paper, we propose a new frame-based linear prediction (LP) analysis method that includes a regularization term that penalizes energy differences in consecutive frames of an all-pole spectral envelope model. This integrates the slowly varying nature of the vocal tract as a part of the analysis. Objective evaluations related to feature distortion and phonetic representational capability were performed by studying the properties of the mel-frequency cepstral coefficient (MFCC) representations computed from different spectral estimation methods under noisy conditions using the TIMIT database. The results show that the proposed time-regularized LP approach exhibits superior MFCC distortion behavior while simultaneously having the greatest average separability of different phoneme categories in comparison to the other methods.
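The idea of regularizing LP analysis across time can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract describes a penalty on energy differences between consecutive all-pole envelopes, whereas the sketch below assumes a simpler stand-in, a quadratic penalty on the difference between the current and previous frame's predictor coefficients. The function name `lp_time_regularized` and the parameters `a_prev` and `lam` are hypothetical.

```python
import numpy as np

def lp_time_regularized(frame, order, a_prev=None, lam=0.0):
    """Least-squares linear prediction with an optional temporal penalty.

    Minimizes ||y - X a||^2 + lam * ||a - a_prev||^2, where row n of X holds
    the `order` samples preceding y[n]. The penalty form is an assumption
    made for illustration, not the method evaluated in the paper.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Column k-1 holds the signal delayed by k samples (k = 1..order).
    X = np.column_stack([frame[order - k:n - k] for k in range(1, order + 1)])
    y = frame[order:]
    if a_prev is None:
        # First frame: no temporal context, fall back to plain least squares.
        a_prev = np.zeros(order)
        lam = 0.0
    R = X.T @ X
    r = X.T @ y
    # Closed-form solution of the penalized normal equations.
    return np.linalg.solve(R + lam * np.eye(order), r + lam * np.asarray(a_prev))

def analyze(signal, frame_len, hop, order, lam):
    """Run the frame-wise analysis, passing each solution on to the next frame."""
    coeffs, a_prev = [], None
    for start in range(0, len(signal) - frame_len + 1, hop):
        a_prev = lp_time_regularized(signal[start:start + frame_len], order, a_prev, lam)
        coeffs.append(a_prev)
    return np.array(coeffs)
```

With `lam = 0` each frame is analyzed independently, as in conventional LP; larger values pull each frame's solution toward the previous one, which is one simple way to encode the assumption that the vocal tract varies slowly.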
Pages: 701-705
Number of pages: 5
Related papers
50 in total
  • [31] FLEXIBLE MULTICHANNEL SPEECH ENHANCEMENT FOR NOISE-ROBUST FRONTEND
    Jukic, Ante
    Balam, Jagadeesh
    Ginsburg, Boris
    2023 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS, WASPAA, 2023,
  • [32] Factorial Speech Processing Models for Noise-Robust Automatic Speech Recognition
    Khademian, Mahdi
    Homayounpour, Mohammad Mehdi
    2015 23RD IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2015, : 637 - 642
  • [33] Speech Enhancement for Noise-Robust Speech Synthesis using Wasserstein GAN
    Adiga, Nagaraj
    Pantazis, Yannis
    Tsiaras, Vassilis
    Stylianou, Yannis
    INTERSPEECH 2019, 2019, : 1821 - 1825
  • [34] Spectral compensation for linear prediction of speech signals in coloured noise
    Hu, HT
    ELECTRONICS LETTERS, 1998, 34 (11) : 1080 - 1081
  • [35] Photo Semantic Understanding and Retargeting by a Noise-Robust Regularized Topic Model
    Wang, Guifeng
    Zhang, Luming
    Li, Yongbin
    Sheng, Yichuan
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2023, 16 : 3495 - 3505
  • [36] Employing Robust Principal Component Analysis for Noise-Robust Speech Feature Extraction in Automatic Speech Recognition with the Structure of a Deep Neural Network
    Hung, Jeih-weih
    Lin, Jung-Shan
    Wu, Po-Jen
    APPLIED SYSTEM INNOVATION, 2018, 1 (03) : 1 - 14
  • [37] Noise-Robust Feature Extraction Based on Forward Masking
    Chiou, Sheng-Chiuan
    Chen, Chia-Ping
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1243 - 1246
  • [38] Noise-Robust Conformal Prediction for Medical Image Classification
    Penso, Coby
    Goldberger, Jacob
    MACHINE LEARNING IN MEDICAL IMAGING, PT II, MLMI 2024, 2025, 15242 : 159 - 168
  • [39] Noise-robust speech feature processing with empirical mode decomposition
    Kuo-Hau Wu
    Chia-Ping Chen
    Bing-Feng Yeh
    EURASIP Journal on Audio, Speech, and Music Processing, 2011
  • [40] Noise-robust speech recognition based on difference of power spectrum
    Xu, JF
    Wei, G
    ELECTRONICS LETTERS, 2000, 36 (14) : 1247 - 1248