DETECTION, SEPARATION AND RECOGNITION OF SPEECH FROM CONTINUOUS SIGNALS USING SPECTRAL FACTORISATION

Cited by: 0
Authors
Hurmalainen, Antti [1]
Gemmeke, Jort F. [2]
Virtanen, Tuomas [1]
Affiliations
[1] Tampere Univ Technol, POB 553, FI-33101 Tampere, Finland
[2] Katholieke Univ Leuven, Dept ESAT PSI, B-3001 Heverlee, Belgium
Keywords
Spectral factorization; speech recognition; speaker recognition; voice activity detection; speech separation;
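DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;
Abstract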
In real-world speech processing, signals are often continuous and consist of momentary segments of speech over non-stationary background noise. It has been demonstrated that spectral factorisation using multi-frame atoms can be successfully employed to separate and recognise speech in adverse conditions. Whereas previous work used full knowledge of utterance endpointing and speaker identity for noise modelling and speech recognition, this study proposes spectral factorisation and sparse classification techniques to detect, identify, separate and recognise speech from a continuous noisy input. Speech models are trained beforehand, but noise models are acquired adaptively from the input by using voice activity detection without prior knowledge of noise-only locations. The results are evaluated on the CHiME corpus, which contains utterances from 34 speakers over highly non-stationary multi-source noise.
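As a rough illustration of the kind of processing the abstract describes, the sketch below factorises a magnitude spectrogram against a fixed dictionary of speech and noise atoms and uses the result both to separate speech and to flag speech-active frames. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `activations` and `separate`, the KL-divergence multiplicative update, the L1 `sparsity` weight, and the `vad_threshold` value are all illustrative choices. Atoms here cover a single frame; multi-frame atoms could be approximated by stacking consecutive frames into each column.

```python
import numpy as np


def activations(V, D, n_iter=200, sparsity=0.0, eps=1e-12):
    """Estimate non-negative H with V ~= D @ H while the dictionary D stays fixed.

    Uses the standard multiplicative update for the generalised
    Kullback-Leibler divergence, with an optional L1 penalty on H.
    """
    rng = np.random.default_rng(0)
    H = rng.random((D.shape[1], V.shape[1])) + 0.1  # strictly positive start
    col_sums = D.sum(axis=0)[:, None]               # D^T applied to all-ones
    for _ in range(n_iter):
        ratio = V / (D @ H + eps)
        H *= (D.T @ ratio) / (col_sums + sparsity + eps)
    return H


def separate(V, D_speech, D_noise, vad_threshold=0.2, eps=1e-12, **kwargs):
    """Split V into a speech estimate and a per-frame speech-activity flag.

    V        : (bins, frames) non-negative magnitude spectrogram
    D_speech : (bins, n_speech_atoms) pre-trained speech atoms (assumed given)
    D_noise  : (bins, n_noise_atoms) noise atoms (assumed given)
    """
    D = np.hstack([D_speech, D_noise])
    H = activations(V, D, **kwargs)
    k = D_speech.shape[1]
    S = D_speech @ H[:k]          # speech part of the reconstruction
    N = D_noise @ H[k:]           # noise part of the reconstruction
    mask = S / (S + N + eps)      # Wiener-style soft mask
    # Hypothetical activity rule: fraction of reconstructed energy
    # that the speech atoms explain in each frame.
    speech_share = S.sum(axis=0) / (S.sum(axis=0) + N.sum(axis=0) + eps)
    return mask * V, speech_share > vad_threshold
```

Calling `separate(V, D_speech, D_noise)` returns the masked speech spectrogram and a boolean vector over frames. In an adaptive scheme such as the one the abstract outlines, frames flagged as non-speech could be used to refresh `D_noise`; that step is left out here.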
Pages: 2649-2653
Number of pages: 5