Evaluation Framework for Distant-talking Speech Recognition under Reverberant Environments - Newest Part of the CENSREC Series -

Cited by: 0

Authors:

Nishiura, Takanobu

Nakayama, Masato

Denda, Yuki

Kitaoka, Norihide

Yamamoto, Kazumasa

Yamada, Takeshi

Tsuge, Satoru

Miyajima, Chiyomi

Fujimoto, Masakiyo

Takiguchi, Tetsuya

Tamura, Satoshi

Kuroiwa, Shingo

Takeda, Kazuya

Nakamura, Satoshi

Source:

SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008 | 2008

DOI:

Not available

Chinese Library Classification (CLC) number:

H0 [Linguistics];

Subject classification codes:

030303 ; 0501 ; 050102 ;

Abstract:

In recent years, statistical methods and large speech databases have drastically improved speech recognition performance. Attention is now turning to performance under realistic environments such as noisy conditions. Since October 2001, our working group of the Information Processing Society of Japan has been developing evaluation methodologies and frameworks for Japanese noisy speech recognition. We have released frameworks, each consisting of databases and evaluation tools: CENSREC-1 (Corpus and Environment for Noisy Speech RECognition 1; formerly AURORA-2J), CENSREC-2 (in-car connected digits recognition), CENSREC-3 (in-car isolated word recognition), and CENSREC-1-C (voice activity detection under noisy conditions). In this paper, we introduce a new collection of databases and evaluation tools named CENSREC-4, an evaluation framework for distant-talking speech under hands-free conditions. Distant-talking speech recognition is crucial for a hands-free speech interface, so we measured room impulse responses to investigate reverberant speech recognition. In the evaluation experiments, a traditional dereverberation process did not sufficiently improve recognition performance, which indicates that CENSREC-4 is an effective database for evaluating new dereverberation methods. The framework was released in March 2008, and many studies are being conducted with it in Japan.
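
As an illustration of why measured room impulse responses are useful here, the sketch below shows one common way to obtain reverberant speech: convolving a clean signal with an impulse response. This is a minimal sketch and not the CENSREC-4 tooling. The abstract does not state how the impulse responses are applied, so the convolution step is an assumption, and `synthetic_rir`, `convolve`, the sine-wave "speech", and all parameter values are hypothetical stand-ins.

```python
import math
import random


def synthetic_rir(length, decay):
    """Toy exponentially decaying impulse response.

    Hypothetical stand-in for a measured room impulse response.
    """
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    rir = [rng.uniform(-1.0, 1.0) * math.exp(-decay * n) for n in range(length)]
    rir[0] = 1.0  # direct-path component
    return rir


def convolve(signal, rir):
    """Full linear convolution; output length is len(signal) + len(rir) - 1."""
    out = [0.0] * (len(signal) + len(rir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(rir):
            out[i + j] += s * h
    return out


# A sine wave stands in for a clean speech signal (assumption for illustration).
clean = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
rir = synthetic_rir(length=50, decay=0.1)
reverberant = convolve(clean, rir)
```

The reverberant signal is longer than the clean one because the tail of the impulse response smears each sample forward in time, which is the effect a dereverberation method tries to undo.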

Pages: 1828-1834

Number of pages: 7
