MULTILINGUAL DEEP NEURAL NETWORK BASED ACOUSTIC MODELING FOR RAPID LANGUAGE ADAPTATION

Times cited: 0
Authors
Vu, Ngoc Thang [1]
Imseng, David
Povey, Daniel
Motlicek, Petr
Schultz, Tanja [1]
Bourlard, Herve
Affiliations
[1] Karlsruhe Inst Technol, D-76021 Karlsruhe, Germany
Keywords
Multilingual DNN; phone merging; rapid language adaptation; KL-HMM
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents a study on multilingual deep neural network (DNN) based acoustic modeling and its application to new languages. We investigate the effect of phone merging on multilingual DNNs in the context of rapid language adaptation. Moreover, the combination of multilingual DNNs with Kullback-Leibler divergence based acoustic modeling (KL-HMM) is explored. Using ten different languages from the GlobalPhone database, our studies reveal that crosslingual acoustic model transfer through multilingual DNNs is superior to unsupervised RBM pre-training and greedy layer-wise supervised training. We also found that KL-HMM based decoding consistently outperforms conventional hybrid decoding, especially in low-resource scenarios. Furthermore, the experiments indicate that multilingual DNN training benefits equally from simple phoneset concatenation and manually derived universal phonesets.
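The abstract names KL-HMM decoding but gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each HMM state is described by a categorical distribution over the DNN's phone classes and that the local score of a frame is the Kullback-Leibler divergence between that state distribution and the frame's DNN posterior vector. The direction of the divergence, the smoothing constant `eps`, and the function names `kl_divergence` and `best_state` are all assumptions made for illustration.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) for two categorical distributions of equal length.

    eps is an assumed smoothing constant that avoids log(0) and division by zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def best_state(posterior, state_distributions):
    """Score one frame against every state and return (best index, all scores).

    posterior: DNN phone posteriors for a single frame.
    state_distributions: one categorical distribution per HMM state.
    A lower divergence means a better match.
    """
    scores = [kl_divergence(y, posterior) for y in state_distributions]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical toy data: two states over three phone classes.
states = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]
frame_posterior = [0.7, 0.2, 0.1]
index, scores = best_state(frame_posterior, states)
```

In a full decoder these per-frame divergences would take the place of the scaled likelihoods used in hybrid decoding and would be accumulated along the Viterbi path; that step is omitted here.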
Pages: 5