DETECTION, SEPARATION AND RECOGNITION OF SPEECH FROM CONTINUOUS SIGNALS USING SPECTRAL FACTORISATION

Cited by: 0
Authors
Hurmalainen, Antti [1]
Gemmeke, Jort F. [2]
Virtanen, Tuomas [1]
Affiliations
[1] Tampere Univ Technol, POB 553, FI-33101 Tampere, Finland
[2] Katholieke Univ Leuven, Dept ESAT PSI, B-3001 Heverlee, Belgium
Source
2012 PROCEEDINGS OF THE 20TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2012
Keywords
Spectral factorization; speech recognition; speaker recognition; voice activity detection; speech separation
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
In real-world speech processing, signals are often continuous and consist of momentary segments of speech over non-stationary background noise. It has been demonstrated that spectral factorisation using multi-frame atoms can be successfully employed to separate and recognise speech in adverse conditions. Whereas previous work used full knowledge of utterance endpointing and speaker identity for noise modelling and speech recognition, this study proposes spectral factorisation and sparse classification techniques to detect, identify, separate and recognise speech from a continuous noisy input. Speech models are trained beforehand, but noise models are acquired adaptively from the input by using voice activity detection, without prior knowledge of noise-only locations. The results are evaluated on the CHiME corpus, which contains utterances from 34 speakers over highly non-stationary multi-source noise.
Pages: 2649-2653
Number of pages: 5
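
Illustrative sketch
The abstract describes separating speech from noise by spectral factorisation with pre-trained speech atoms and noise atoms. The record does not give the paper's algorithm, so the following is only a minimal sketch of one common form of the idea: non-negative factorisation of a magnitude spectrogram against a fixed dictionary, using Kullback-Leibler multiplicative updates, followed by a soft mask. The function names `activations_kl` and `separate_speech`, the choice of KL divergence, the iteration count and the mask are all assumptions made for illustration. Each column of `V` may be a single frame or a stacked window of several frames (one way to represent multi-frame atoms); the sketch does not depend on which.

```python
import numpy as np


def activations_kl(V, W, n_iter=200, eps=1e-12):
    """Estimate non-negative activations H such that V ~ W @ H, with W held fixed.

    Uses the standard multiplicative update for the (generalised)
    Kullback-Leibler divergence. V is (features x frames), W is
    (features x atoms). Hypothetical helper, not taken from the paper.
    """
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    # Denominator W^T 1: column sums of W, shaped for broadcasting over frames.
    col_sums = W.sum(axis=0)[:, None] + eps
    for _ in range(n_iter):
        ratio = V / (W @ H + eps)
        H *= (W.T @ ratio) / col_sums
    return H


def separate_speech(V, W_speech, W_noise, n_iter=200, eps=1e-12):
    """Return a speech estimate and the activations for a noisy spectrogram V.

    W_speech stands in for atoms trained beforehand; W_noise stands in for
    atoms that would be gathered from segments a voice activity detector
    labels as noise-only (both are assumed inputs here).
    """
    W = np.hstack([W_speech, W_noise])
    H = activations_kl(V, W, n_iter=n_iter, eps=eps)
    k = W_speech.shape[1]
    speech_part = W_speech @ H[:k]
    noise_part = W_noise @ H[k:]
    # Soft mask: fraction of the modelled energy attributed to speech atoms.
    mask = speech_part / (speech_part + noise_part + eps)
    return mask * V, H
```

In such a setup the speech-atom rows of `H` could also feed a classifier (for example, summing activations per speaker or per word label), which is roughly what "sparse classification" refers to; that step is not shown here.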