Composite decision by Bayesian inference in distant-talking speech recognition

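Authors
Ji, Mikyong [1]
Kim, Sungtak [1]
Kim, Hoirin [1]
Affiliation
[1] Informat & Commun Univ, SRT Lab, Taejon 305732, South Korea
Source
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract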
This paper describes an integrated system that produces a composite recognition output for distant-talking speech when recognition results from multiple microphone inputs are available. In many cases, the composite recognition result has a lower error rate than any individual output. In this work, the composite recognition result is obtained by applying Bayesian inference. The log-likelihood score is assumed to follow a Gaussian distribution, at least approximately. First, the distribution of the likelihood score is estimated on the development set. Then, the confidence interval for the likelihood score is used to remove unreliable microphone channels. Finally, the area under the distribution between the likelihood score of a hypothesis and that of the (N+1)st hypothesis is obtained for every channel and integrated over all channels by Bayesian inference. The proposed system shows considerable performance improvement over both an ordinary method that sums likelihoods and the recognition result of any single channel.
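
The three steps in the abstract (Gaussian fit, channel rejection, area-based combination) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so several choices here are assumptions. The reliability test is applied to each channel's best score with a 95% interval (z = 1.96), the per-channel areas are multiplied as if channels were independent with a uniform prior, and a hypothesis missing from a channel's N-best list receives a small floor value `eps`. All function names and the `channels` data layout are hypothetical.

```python
import math
from statistics import mean, stdev


def gaussian_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_gaussian(dev_scores):
    """Estimate mean and sample standard deviation from development-set scores."""
    return mean(dev_scores), stdev(dev_scores)


def is_reliable(score, mu, sigma, z=1.96):
    """Assumed test: keep a channel if its best score lies within mu +/- z*sigma."""
    return abs(score - mu) <= z * sigma


def area_between(score, floor_score, mu, sigma):
    """Gaussian area between a hypothesis score and the (N+1)st hypothesis score."""
    area = gaussian_cdf(score, mu, sigma) - gaussian_cdf(floor_score, mu, sigma)
    return max(area, 0.0)


def composite_decision(channels, z=1.96, eps=1e-12):
    """Combine N-best lists from several channels.

    Each channel is a dict (hypothetical layout) with:
      'dev_scores': log-likelihood scores observed on the development set
      'nbest':      {hypothesis: log-likelihood} for the top N hypotheses
      'floor':      log-likelihood of the (N+1)st hypothesis
    Returns (best_hypothesis, posterior), or (None, {}) if no channel survives.
    """
    # Steps 1 and 2: fit the score distribution, drop unreliable channels.
    kept = []
    for ch in channels:
        mu, sigma = fit_gaussian(ch["dev_scores"])
        if is_reliable(max(ch["nbest"].values()), mu, sigma, z):
            kept.append((ch, mu, sigma))
    if not kept:
        return None, {}

    hypotheses = set()
    for ch, _, _ in kept:
        hypotheses.update(ch["nbest"])

    # Step 3: sum log-areas over channels (assumed independence, uniform prior).
    log_evidence = {}
    for hyp in hypotheses:
        total = 0.0
        for ch, mu, sigma in kept:
            if hyp in ch["nbest"]:
                area = area_between(ch["nbest"][hyp], ch["floor"], mu, sigma)
            else:
                area = 0.0  # hypothesis absent from this channel's N-best list
            total += math.log(max(area, eps))
        log_evidence[hyp] = total

    # Normalize in a numerically stable way to obtain a posterior.
    peak = max(log_evidence.values())
    weights = {h: math.exp(v - peak) for h, v in log_evidence.items()}
    norm = sum(weights.values())
    posterior = {h: w / norm for h, w in weights.items()}
    return max(posterior, key=posterior.get), posterior
```

Working in the log domain keeps the product of small areas from underflowing when many channels are combined. The summation-of-likelihoods baseline mentioned in the abstract would instead add the raw scores per hypothesis, which lets a single outlying channel dominate the decision.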
Pages: 463-470
Number of pages: 8