Composite decision by Bayesian inference in distant-talking speech recognition

Cited by: 0

Authors:

Ji, Mikyong [1]

Kim, Sungtak [1]

Kim, Hoirin [1]

Affiliations:

[1] Informat & Commun Univ, SRT Lab, Taejon 305732, South Korea

Source:

TEXT, SPEECH AND DIALOGUE, PROCEEDINGS | 2006 / Vol. 4188

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper describes an integrated system that produces a composite recognition output for distant-talking speech when recognition results from multiple microphone inputs are available. In many cases, the composite recognition result has a lower error rate than any individual output. In this work, the composite recognition result is obtained by applying Bayesian inference. The log likelihood score is assumed to follow a Gaussian distribution, at least approximately. First, the distribution of the likelihood score is estimated on the development set. Then, the confidence interval for the likelihood score is used to remove unreliable microphone channels. Finally, the area under the distribution between the likelihood score of a hypothesis and that of the (N+1)st hypothesis is obtained for every channel and integrated over all channels by Bayesian inference. The proposed system shows considerable performance improvement over both an ordinary method that sums the likelihoods and the recognition result of any individual channel.
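
The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation; the abstract does not give the exact formulas, so several choices here are assumptions: the Gaussian parameters `mean` and `std` are taken as already estimated on a development set, a channel is judged reliable when its top hypothesis score lies inside a two-sided interval `mean ± z * std`, the per-channel areas are combined as a product under a uniform prior with channels treated as independent, and a small `eps` stands in for hypotheses missing from a channel's N-best list. All function names, parameters, and scores are hypothetical.

```python
import math


def gaussian_cdf(x, mean, std):
    """CDF of a normal distribution N(mean, std^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def composite_decision(channels, mean, std, n_best, z=1.96, eps=1e-6):
    """Combine per-channel N-best lists into one decision (illustrative only).

    channels: list of dicts mapping hypothesis -> log-likelihood score;
              each dict must hold more than n_best hypotheses.
    Returns (best_hypothesis, posterior), or (None, {}) if no channel is kept.
    """
    # Step 2 (assumed form): keep a channel only if its top score lies
    # inside the interval mean +/- z * std.
    reliable = []
    for scores in channels:
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        if abs(ranked[0][1] - mean) <= z * std:
            reliable.append(ranked)
    if not reliable:
        return None, {}

    # Candidates are the top-N hypotheses of every retained channel.
    candidates = {hyp for ranked in reliable for hyp, _ in ranked[:n_best]}
    log_post = {hyp: 0.0 for hyp in candidates}  # uniform prior (assumption)

    # Step 3 (assumed form): per channel, use the Gaussian area between a
    # hypothesis score and the (N+1)st score as evidence, then multiply
    # the evidence across channels (sum of logs).
    for ranked in reliable:
        floor_cdf = gaussian_cdf(ranked[n_best][1], mean, std)
        top = dict(ranked[:n_best])
        for hyp in candidates:
            area = gaussian_cdf(top[hyp], mean, std) - floor_cdf if hyp in top else 0.0
            log_post[hyp] += math.log(max(area, 0.0) + eps)

    # Normalize to a posterior over the candidates.
    peak = max(log_post.values())
    weights = {h: math.exp(v - peak) for h, v in log_post.items()}
    total = sum(weights.values())
    posterior = {h: w / total for h, w in weights.items()}
    return max(posterior, key=posterior.get), posterior


# Made-up scores for three microphone channels.
channel_a = {"open": -45.0, "close": -48.0, "stop": -55.0}
channel_b = {"open": -46.0, "close": -47.0, "stop": -56.0}
channel_c = {"close": -20.0, "open": -30.0, "stop": -35.0}  # far outside the interval

best, posterior = composite_decision(
    [channel_a, channel_b, channel_c], mean=-50.0, std=5.0, n_best=2
)
```

In this toy input the third channel falls outside the interval and is discarded, so the decision rests on the two remaining channels. Working in the log domain keeps the product of areas numerically stable when many channels are combined.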

Citation

Pages: 463-470

Page count: 8

50 items in total

[21] Investigations into Early and Late Reflections on Distant-Talking Speech Recognition Toward Suitable Reverberation Criteria
Nishiura, Takanobu
Hirano, Yoshiki
Denda, Yuki
Nakayama, Masato
INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007: 1369-1372
[22] Phase and reverberation aware DNN for distant-talking speech enhancement
Zeyan Oo
Longbiao Wang
Khomdet Phapatanaburi
Masahiro Iwahashi
Seiichi Nakagawa
Jianwu Dang
Multimedia Tools and Applications, 2018, 77: 18865-18880
[23] Distant-Talking Speech Recognition Based on Spectral Subtraction by Multi-Channel LMS Algorithm
Wang, Longbiao
Kitaoka, Norihide
Nakagawa, Seiichi
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2011, E94D (03): 659-667
[24] Reverberation Model-Based Decoding in the Logmelspec Domain for Robust Distant-Talking Speech Recognition
Sehr, Armin
Maas, Roland
Kellermann, Walter
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (07): 1676-1691
[25] JOINT SPARSE REPRESENTATION BASED CEPSTRAL-DOMAIN DEREVERBERATION FOR DISTANT-TALKING SPEECH RECOGNITION
Li, Weifeng
Wang, Longbiao
Zhou, Fei
Liao, Qingmin
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013: 7117-7120
[26] Distant-talking robust speech recognition using late reflection components of room impulse response
Gomez, Randy
Even, Jani
Saruwatari, Hiroshi
Shikano, Kiyohiro
2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008: 4581-4584
[27] Evaluation Framework for Distant-talking Speech Recognition under Reverberant Environments - Newest Part of the CENSREC Series -
Nishiura, Takanobu
Nakayama, Masato
Denda, Yuki
Kitaoka, Norihide
Yamamoto, Kazumasa
Yamada, Takeshi
Tsuge, Satoru
Miyajima, Chiyomi
Fujimoto, Masakiyo
Takiguchi, Tetsuya
Tamura, Satoshi
Kuroiwa, Shingo
Takeda, Kazuya
Nakamura, Satoshi
SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008, 2008: 1828-1834
[28] CENSREC-4: Development of Evaluation Framework for Distant-talking Speech Recognition under Reverberant Environments
Nakayama, Masato
Nishiura, Takanobu
Denda, Yuki
Kitaoka, Norihide
Yamamoto, Kazumasa
Yamada, Takeshi
Tsuge, Satoru
Miyajima, Chiyomi
Fujimoto, Masakiyo
Takiguchi, Tetsuya
Tamura, Satoshi
Ogawa, Tetsuji
Matsuda, Shigeki
Kuroiwa, Shingo
Takeda, Kazuya
Nakamura, Satoshi
INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008: 968-+
[29] Distant-talking speech recognition based on a 3-D Viterbi search using a microphone array
Yamada, T
Nakamura, S
Shikano, K
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2002, 10 (02): 48-56
[30] Distant-talking speech recognition with microphone-array sound pickup and NN/MLLR environment equalization
Lin, QG
Flanagan, J
Che, CW
PROGRESS IN CONNECTIONIST-BASED INFORMATION SYSTEMS, VOLS 1 AND 2, 1998: 1099-1102
