Development of the CUHK Dysarthric Speech Recognition System for the UASpeech Corpus

Cited by: 0

Authors:

Yu, Jianwei [1]

Xie, Xurong [2]

Liu, Shansong [1]

Hu, Shoukang [1]

Lam, Max W. Y. [1]

Wu, Xixin [1]

Wong, Ka Ho [1]

Liu, Xunying [1]

Meng, Helen [1]

Affiliations:

[1] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Hong Kong, Peoples R China

[2] Chinese Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China

Source:

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018

Keywords:

dysarthric speech; speech recognition; cross domain adaptation; system combination; auto-encoder;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Dysarthric speech recognition is a highly challenging task. The articulatory motor control problems associated with neuromotor conditions produce a large mismatch against normal speech. In addition, such data is difficult to collect in large quantities. This paper presents the development of the Chinese University of Hong Kong automatic speech recognition (ASR) system for the Universal Access Speech (UASpeech) corpus [1]. A range of deep neural network (DNN) acoustic models and their more advanced variants based on time delay neural networks (TDNNs) and long short-term memory recurrent neural networks (LSTM-RNNs) were developed. Speaker adaptation by learning hidden unit contributions (LHUC) was used. A semi-supervised complementary auto-encoder system was further constructed to improve the bottleneck feature extraction. Two out-of-domain (OOD) ASR systems, separately trained on broadcast news and Switchboard data, were cross-domain adapted towards the UASpeech data and adopted in system combination. The final combined system gave an overall word accuracy of 69.4% on the 16-speaker test set.
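
The abstract names LHUC speaker adaptation without giving details. As a minimal sketch of the general idea only, and not the configuration used in the paper, LHUC is commonly formulated as rescaling each hidden unit's output by a speaker-dependent amplitude r = 2 * sigmoid(a), where a is a per-unit parameter learned from that speaker's data. The function name `lhuc_scale` and the toy values below are hypothetical and chosen purely for illustration.

```python
import math
from typing import List


def lhuc_scale(hidden: List[float], speaker_params: List[float]) -> List[float]:
    """Scale each hidden unit output by a speaker-dependent amplitude in (0, 2).

    Hypothetical helper for illustration; it is not taken from the paper.
    """
    if len(hidden) != len(speaker_params):
        raise ValueError("one LHUC parameter is needed per hidden unit")
    # Amplitude r = 2 * sigmoid(a); a = 0 gives r = 1, i.e. the unadapted output.
    return [2.0 / (1.0 + math.exp(-a)) * h for h, a in zip(hidden, speaker_params)]


# Toy hidden-layer outputs for one frame (illustrative values only).
hidden = [0.5, -1.0, 2.0]

# With all parameters at zero the layer output is unchanged.
unadapted = lhuc_scale(hidden, [0.0, 0.0, 0.0])

# Speaker-specific parameters amplify the first unit and damp the second.
adapted = lhuc_scale(hidden, [1.0, -1.0, 0.0])
```

In this formulation only the per-unit parameters are updated for each speaker while the shared network weights stay fixed, which keeps the number of speaker-specific values small. That property is relevant when, as the abstract notes, data is difficult to collect in large quantities.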


Pages: 2938-2942

Page count: 5
