MULTILINGUAL DEEP NEURAL NETWORK BASED ACOUSTIC MODELING FOR RAPID LANGUAGE ADAPTATION

Cited by: 0
Authors
Vu, Ngoc Thang [1]
Imseng, David
Povey, Daniel
Motlicek, Petr
Schultz, Tanja [1]
Bourlard, Herve
Affiliations
[1] Karlsruhe Inst Technol, D-76021 Karlsruhe, Germany
Keywords
Multilingual DNN; phone merging; rapid language adaptation; KL-HMM
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents a study on multilingual deep neural network (DNN) based acoustic modeling and its application to new languages. We investigate the effect of phone merging on multilingual DNNs in the context of rapid language adaptation. Moreover, the combination of multilingual DNNs with Kullback-Leibler divergence based acoustic modeling (KL-HMM) is explored. Using ten different languages from the GlobalPhone database, our studies reveal that crosslingual acoustic model transfer through multilingual DNNs is superior to unsupervised RBM pre-training and greedy layer-wise supervised training. We also found that KL-HMM based decoding consistently outperforms conventional hybrid decoding, especially in low-resource scenarios. Furthermore, the experiments indicate that multilingual DNN training benefits equally from simple phoneset concatenation and from manually derived universal phonesets.
Pages: 5