Development of the CUHK Dysarthric Speech Recognition System for the UASpeech Corpus

Cited by: 0

Authors:

Yu, Jianwei [1]

Xie, Xurong [2]

Liu, Shansong [1]

Hu, Shoukang [1]

Lam, Max W. Y. [1]

Wu, Xixin [1]

Wong, Ka Ho [1]

Liu, Xunying [1]

Meng, Helen [1]

Affiliations:

[1] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Hong Kong, Peoples R China

[2] Chinese Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China

Source:

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018

Keywords:

dysarthric speech; speech recognition; cross domain adaptation; system combination; auto-encoder;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Dysarthric speech recognition is a highly challenging task. The articulatory motor control problems associated with neuromotor conditions produce a large mismatch against normal speech. In addition, such data is difficult to collect in large quantities. This paper presents the development of the Chinese University of Hong Kong automatic speech recognition (ASR) system for the Universal Access Speech (UASpeech) corpus [1]. A range of deep neural network (DNN) acoustic models and their more advanced variants based on time delay neural networks (TDNNs) and long short-term memory recurrent neural networks (LSTM-RNNs) were developed. Speaker adaptation by learning hidden unit contributions (LHUC) was used. A semi-supervised complementary auto-encoder system was further constructed to improve the bottleneck feature extraction. Two out-of-domain (OOD) ASR systems, separately trained on broadcast news and Switchboard data, were cross-domain adapted towards the UASpeech data and used in system combination. The final combined system gave an overall word accuracy of 69.4% on the 16-speaker test set.


Pages: 2938-2942

Page count: 5

50 items in total

[41] Deep Autoencoder based Speech Features for Improved Dysarthric Speech Recognition
Vachhani, Bhavik
Bhat, Chitralekha
Das, Biswajit
Kopparapu, Sunil Kumar
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1854 - 1858
[42] RAW SOURCE AND FILTER MODELLING FOR DYSARTHRIC SPEECH RECOGNITION
Yue, Zhengjun
Loweimi, Erfan
Cvetkovic, Zoran
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7377 - 7381
[43] Domain Adversarial Neural Networks for Dysarthric Speech Recognition
Woszczyk, Dominika
Petridis, Stavros
Millard, David
INTERSPEECH 2020, 2020, : 3875 - 3879
[44] Improved Acoustic Modeling for Automatic Dysarthric Speech Recognition
Sriranjani, R.
Reddy, M. Ramasubba
Umesh, S.
2015 TWENTY FIRST NATIONAL CONFERENCE ON COMMUNICATIONS (NCC), 2015,
[45] Modelling Errors in Automatic Speech Recognition for Dysarthric Speakers
Santiago Omar Caballero Morales
Stephen J. Cox
EURASIP Journal on Advances in Signal Processing, 2009
[46] Integration of metamodel and acoustic model for dysarthric speech recognition
Matsumasa, Hironori
Takiguchi, Tetsuya
Ariki, Yasuo
Li, I-Chao
Nakabayashi, Toshitaka
Journal of Multimedia, 2009, 4 (04): : 254 - 261
[47] Comparison of Noise Reduction Techniques for Dysarthric Speech Recognition
Mulfari, Davide
Campobello, Giuseppe
Gugliandolo, Giovanni
Celesti, Antonio
Villari, Massimo
Donato, Nicola
2022 IEEE INTERNATIONAL SYMPOSIUM ON MEDICAL MEASUREMENTS AND APPLICATIONS (MEMEA 2022), 2022,
[48] Modelling Errors in Automatic Speech Recognition for Dysarthric Speakers
Morales, Santiago Omar Caballero
Cox, Stephen J.
EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2009,
[49] Dysarthric Speech Recognition Based on Deep Metric Learning
Takashima, Yuki
Takashima, Ryoichi
Takiguchi, Tetsuya
Ariki, Yasuo
INTERSPEECH 2020, 2020, : 4796 - 4800
[50] ON THE USE OF HIDDEN MARKOV MODELING FOR RECOGNITION OF DYSARTHRIC SPEECH
DELLER, JR
HSU, D
FERRIER, LJ
COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 1991, 35 (02) : 125 - 139
