Development of the CUHK Dysarthric Speech Recognition System for the UASpeech Corpus

Cited: 0
Authors
Yu, Jianwei [1 ]
Xie, Xurong [2 ]
Liu, Shansong [1 ]
Hu, Shoukang [1 ]
Lam, Max W. Y. [1 ]
Wu, Xixin [1 ]
Wong, Ka Ho [1 ]
Liu, Xunying [1 ]
Meng, Helen [1 ]
Affiliations
[1] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Hong Kong, Peoples R China
[2] Chinese Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China
Source
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018
Keywords
dysarthric speech; speech recognition; cross domain adaptation; system combination; auto-encoder;
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification
081104; 0812; 0835; 1405;
Abstract
Dysarthric speech recognition is a highly challenging task. The articulatory motor control problems associated with neuromotor conditions produce a large mismatch against normal speech. In addition, such data is difficult to collect in large quantities. This paper presents the development of the Chinese University of Hong Kong automatic speech recognition (ASR) system for the Universal Access Speech (UASpeech) corpus [1]. A range of deep neural network (DNN) acoustic models and their more advanced variants based on time-delay neural networks (TDNNs) and long short-term memory recurrent neural networks (LSTM-RNNs) were developed. Speaker adaptation by learning hidden unit contributions (LHUC) was applied. A semi-supervised complementary auto-encoder system was further constructed to improve the bottleneck feature extraction. Two out-of-domain (OOD) ASR systems, separately trained on broadcast news and Switchboard data, were cross-domain adapted towards the UASpeech data and used in system combination. The final combined system gave an overall word accuracy of 69.4% on the 16-speaker test set.
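Of the adaptation techniques named in the abstract, LHUC is compact enough to illustrate directly. The sketch below is a minimal, illustrative implementation assuming the common 2*sigmoid amplitude re-parameterisation from the LHUC literature; the function name lhuc_scale and the toy values are hypothetical and not taken from the paper.

    import numpy as np

    def lhuc_scale(hidden, r):
        # LHUC: re-weight each hidden unit by a speaker-dependent amplitude
        # a(r) = 2*sigmoid(r), which lies in (0, 2) so individual units can
        # be amplified or attenuated for a given speaker.
        a = 2.0 / (1.0 + np.exp(-r))
        return a * hidden

    # Toy usage: 4 hidden activations for one speaker (made-up values).
    h = np.array([0.5, -1.2, 0.3, 2.0])
    r = np.zeros(4)           # untrained LHUC parameters: a(0) = 1, no change
    print(lhuc_scale(h, r))   # identical to h; training r per speaker adapts it

In practice the speaker-dependent parameters r are estimated on that speaker's adaptation data while the remaining network weights stay fixed, which keeps the per-speaker footprint small.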
Pages: 2938-2942
Page count: 5
Related Papers
50 items in total
  • [31] Data Augmentation using Healthy Speech for Dysarthric Speech Recognition
    Vachhani, Bhavik
    Bhat, Chitralekha
    Kopparapu, Sunil Kumar
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018: 471-475
  • [32] Towards the Improvement of Automatic Recognition of Dysarthric Speech
    Tolba, Hesham
    El Torgoman, Ahmed S.
    2009 2ND IEEE INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND INFORMATION TECHNOLOGY, VOL 1, 2009: 277-+
  • [33] Using speech rhythm knowledge to improve dysarthric speech recognition
    Selouani, S.-A.
    Dahmani, H.
    Amami, R.
    Hamam, H.
    International Journal of Speech Technology, 2012, 15 (1): 57-64
  • [34] Dysarthric speech: A comparison of computerized speech recognition and listener intelligibility
    Doyle, PC
    Leeper, HA
    Kotler, AL
    Thomas-Stonell, N
    O'Neill, C
    Dylke, MC
    Rolls, K
    JOURNAL OF REHABILITATION RESEARCH AND DEVELOPMENT, 1997, 34 (03): 309-316
  • [35] Speech Technology for Automatic Recognition and Assessment of Dysarthric Speech: An Overview
    Bhat, Chitralekha
    Strik, Helmer
    JOURNAL OF SPEECH LANGUAGE AND HEARING RESEARCH, 2025, 68 (02): 547-577
  • [36] A Framework for Collecting Realistic Recordings of Dysarthric Speech - the homeService Corpus
    Nicolao, Mauro
    Christensen, Heidi
    Cunningham, Stuart
    Green, Phil
    Hain, Thomas
    LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2016: 1993-1997
  • [37] Accuracy of three speech recognition systems: Case study of dysarthric speech
    Hux, Karen
    Rankin-Erickson, Joan
    Manasse, Nancy
    Lauritzen, Elizabeth
    2000, Decker Periodicals Publishing, Inc., Hamilton, Canada (16):
  • [38] Comparing Humans and Automatic Speech Recognition Systems in Recognizing Dysarthric Speech
    Mengistu, Kinfe Tadesse
    Rudzicz, Frank
    ADVANCES IN ARTIFICIAL INTELLIGENCE, 2011, 6657: 291-300
  • [39] DYSARTHRIC SPEECH RECOGNITION WITH LATTICE-FREE MMI
    Hermann, Enno
    Magimai-Doss, Mathew
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020: 6109-6113