Composite decision by Bayesian inference in distant-talking speech recognition

Authors
Ji, Mikyong [1]
Kim, Sungtak [1]
Kim, Hoirin [1]
Affiliations
[1] Informat & Commun Univ, SRT Lab, Taejon 305732, South Korea
Source
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes an integrated system that produces a composite recognition output for distant-talking speech when recognition results from multiple microphone inputs are available. In many cases, the composite recognition result has a lower error rate than any individual output. In this work, the composite recognition result is obtained by applying Bayesian inference. The log-likelihood score is assumed to follow a Gaussian distribution, at least approximately. First, the distribution of the likelihood score is estimated on the development set. Then, the confidence interval for the likelihood score is used to remove unreliable microphone channels. Finally, the area under the distribution between the likelihood score of a hypothesis and that of the (N+1)st hypothesis is obtained for every channel and integrated over all channels by Bayesian inference. The proposed system shows a considerable performance improvement over both the conventional method of summing likelihoods and the recognition result of any single channel.
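The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact reliability test or the combination rule, so the sketch assumes that a channel is kept when its top score lies within `z` standard deviations of the development-set mean, that per-channel areas are normalized into weights, and that channels are combined as a product of those weights (independent channels, uniform prior). All function names, the `z=1.96` threshold, and the `eps` floor for hypotheses missing from a channel are hypothetical.

```python
import math
from statistics import mean, stdev


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a Gaussian N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_gaussian(dev_scores):
    """Step 1: estimate the score distribution on development-set scores."""
    return mean(dev_scores), stdev(dev_scores)


def is_reliable(top_score, mu, sigma, z=1.96):
    """Step 2 (assumed rule): keep a channel only if its best score
    falls inside the interval mu +/- z * sigma."""
    return abs(top_score - mu) <= z * sigma


def channel_evidence(nbest, floor_score, mu, sigma):
    """Step 3a: area under the Gaussian between each hypothesis score and
    the (N+1)st hypothesis score, normalized over the N-best list.
    nbest is a list of (hypothesis, log-likelihood score), best first."""
    base = normal_cdf(floor_score, mu, sigma)
    areas = {hyp: max(normal_cdf(score, mu, sigma) - base, 1e-12)
             for hyp, score in nbest}
    total = sum(areas.values())
    return {hyp: area / total for hyp, area in areas.items()}


def combine(channels, eps=1e-6):
    """Step 3b (assumed rule): multiply per-channel weights in the log
    domain and return the highest-scoring hypothesis, or None."""
    kept = [c for c in channels
            if is_reliable(c["nbest"][0][1], c["mu"], c["sigma"])]
    evidences = [channel_evidence(c["nbest"], c["floor_score"],
                                  c["mu"], c["sigma"]) for c in kept]
    hypotheses = set().union(*evidences)
    log_post = {h: sum(math.log(e.get(h, eps)) for e in evidences)
                for h in hypotheses}
    return max(log_post, key=log_post.get) if log_post else None


if __name__ == "__main__":
    # Made-up scores for two microphone channels, for illustration only.
    demo = [
        {"mu": -100.0, "sigma": 10.0, "floor_score": -110.0,
         "nbest": [("open door", -95.0), ("open floor", -98.0)]},
        {"mu": -100.0, "sigma": 10.0, "floor_score": -110.0,
         "nbest": [("open floor", -96.0), ("open door", -97.0)]},
    ]
    print(combine(demo))
```

Working in the log domain avoids underflow when many channels are multiplied, and the `eps` floor keeps a hypothesis that is absent from one channel's N-best list from being eliminated outright.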
Pages: 463-470
Number of pages: 8