Mask estimation for missing data speech recognition based on statistics of binaural interaction

Cited by: 38
Authors
Harding, S [1]
Barker, J [1]
Brown, GJ [1]
Affiliation
[1] Univ Sheffield, Dept Comp Sci, Sheffield S1 4DP, S Yorkshire, England
Keywords
automatic speech recognition; binaural; computational auditory scene analysis (CASA); interaural level differences (ILD); interaural time differences (ITD); missing data; reverberation
DOI
10.1109/TSA.2005.860354
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper describes a perceptually motivated computational auditory scene analysis (CASA) system that combines sound separation according to spatial location with the "missing data" approach for robust speech recognition in noise. Missing data time-frequency masks are created using probability distributions based on estimates of interaural time and level differences (ITD and ILD) for mixed utterances in reverberated conditions; these masks indicate which regions of the spectrum constitute reliable evidence of the target speech signal. A number of experiments compare the relative efficacy of the binaural cues when used individually and in combination. We also investigate the ability of the system to generalize to acoustic conditions not encountered during training. Performance on a continuous digit recognition task using this method is found to be good, even in a particularly challenging environment with three concurrent male talkers.
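The mask-creation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `BinauralMaskEstimator`, the histogram binning, the cue ranges, and the 0.5 threshold are all assumptions made for illustration. The sketch assumes that ITD and ILD estimates are already available for each time-frequency unit, learns from labelled training units how likely a given (ITD, ILD) pair is to be target-dominated, and then marks a unit as reliable when that probability exceeds the threshold.

```python
from collections import defaultdict


def bin_index(value, lo, hi, n_bins):
    """Map a value to a histogram bin index, clamping to the valid range."""
    if value <= lo:
        return 0
    if value >= hi:
        return n_bins - 1
    return int((value - lo) / (hi - lo) * n_bins)


class BinauralMaskEstimator:
    """Hypothetical estimator: joint (ITD, ILD) histogram with a threshold."""

    def __init__(self, itd_range=(-1.0, 1.0), ild_range=(-20.0, 20.0), n_bins=20):
        # Cue ranges (ms and dB) and bin count are illustrative assumptions.
        self.itd_range = itd_range
        self.ild_range = ild_range
        self.n_bins = n_bins
        self.target = defaultdict(int)  # units dominated by the target, per bin
        self.total = defaultdict(int)   # all units observed, per bin

    def _key(self, itd_ms, ild_db):
        return (
            bin_index(itd_ms, *self.itd_range, self.n_bins),
            bin_index(ild_db, *self.ild_range, self.n_bins),
        )

    def train(self, samples):
        """samples: iterable of (itd_ms, ild_db, is_target) training units."""
        for itd_ms, ild_db, is_target in samples:
            key = self._key(itd_ms, ild_db)
            self.total[key] += 1
            if is_target:
                self.target[key] += 1

    def probability(self, itd_ms, ild_db):
        """Estimated probability that a unit with these cues is target-dominated."""
        key = self._key(itd_ms, ild_db)
        count = self.total.get(key, 0)
        if count == 0:
            return 0.0  # unseen cue combination: treat as unreliable
        return self.target.get(key, 0) / count

    def mask(self, cues, threshold=0.5):
        """cues: list of frames, each a list of (itd_ms, ild_db) per channel.

        Returns a binary mask of the same shape (1 = reliable, 0 = missing).
        """
        return [
            [1 if self.probability(itd, ild) > threshold else 0 for itd, ild in frame]
            for frame in cues
        ]


# Toy usage with made-up training units (not data from the paper).
estimator = BinauralMaskEstimator()
estimator.train(
    [(0.0, 0.0, True)] * 9 + [(0.0, 0.0, False)] + [(0.5, 8.0, False)] * 5
)
example_mask = estimator.mask([[(0.0, 0.0), (0.5, 8.0)], [(0.9, -15.0), (0.01, 0.1)]])
```

In this toy setup, units whose cues resemble the frontal training examples are marked reliable, while units with cues seen only for interferers, or never seen at all, are marked missing. A missing data recogniser would then use only the reliable units as evidence of the target speech.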
Pages: 58-67
Page count: 10
Related papers
50 in total
  • [1] Zero-crossing based binaural mask estimation for missing data speech recognition
    Kim, Young-Ik
    An, Sung Jun
    Kil, Rhee Man
    2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13, 2006 : 4947 - 4950
  • [2] Mask estimation based on sound localisation for missing data speech recognition
    Harding, S
    Barker, J
    Brown, GJ
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005 : 537 - 540
  • [3] Vector-Quantization based Mask Estimation for Missing Data Automatic Speech Recognition
    Van Segbroeck, Maarten
    Van Hamme, Hugo
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007 : 1825 - 1828
  • [4] Mask estimation and imputation methods for missing data speech recognition in a multisource reverberant environment
    Keronen, Sami
    Kallasjoki, Heikki
    Remes, Ulpu
    Brown, Guy J.
    Gemmeke, Jort F.
    Palomaki, Kalle J.
    COMPUTER SPEECH AND LANGUAGE, 2013, 27 (03) : 798 - 819
  • [5] A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition
    Seltzer, ML
    Raj, B
    Stern, RM
    SPEECH COMMUNICATION, 2004, 43 (04) : 379 - 393
  • [6] A PITCH BASED NOISE ESTIMATION TECHNIQUE FOR ROBUST SPEECH RECOGNITION WITH MISSING DATA
    Morales-Cordovilla, Juan A.
    Ma, Ning
    Sanchez, Victoria
    Carmona, Jose L.
    Peinado, Antonio M.
    Barker, Jon
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011 : 4808 - 4811
  • [7] Mask Estimation in Non-stationary Noise Environments for Missing Feature Based Robust Speech Recognition
    Badiezadegan, Shirin
    Rose, Richard C.
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010 : 2062 - 2065
  • [8] Mask estimation for missing data recognition using background noise sniffing
    Demange, Sebastien
    Cerisara, Christophe
    Haton, Jean-Paul
    2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006 : 301 - 304
  • [9] A binaural processor for missing data speech recognition in the presence of noise and small-room reverberation
    Palomäki, KJ
    Brown, GJ
    Wang, DL
    SPEECH COMMUNICATION, 2004, 43 (04) : 361 - 378
  • [10] Robust emotional speech recognition based on binaural model and emotional auditory mask in noisy environments
    Bashirpour, Meysam
    Geravanchizadeh, Masoud
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2018