MULTILINGUAL DEEP NEURAL NETWORK BASED ACOUSTIC MODELING FOR RAPID LANGUAGE ADAPTATION

Citations: 0
Authors
Ngoc Thang Vu [1]
Imseng, David
Povey, Daniel
Motlicek, Petr
Schultz, Tanja [1]
Bourlard, Herve
Affiliations
[1] Karlsruhe Inst Technol, D-76021 Karlsruhe, Germany
Keywords
Multilingual DNN; phone merging; rapid language adaptation; KL-HMM
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents a study on multilingual deep neural network (DNN) based acoustic modeling and its application to new languages. We investigate the effect of phone merging on multilingual DNNs in the context of rapid language adaptation. Moreover, the combination of multilingual DNNs with Kullback-Leibler divergence based acoustic modeling (KL-HMM) is explored. Using ten different languages from the GlobalPhone database, our studies reveal that crosslingual acoustic model transfer through multilingual DNNs is superior to unsupervised RBM pre-training and greedy layer-wise supervised training. We also found that KL-HMM based decoding consistently outperforms conventional hybrid decoding, especially in low-resource scenarios. Furthermore, the experiments indicate that multilingual DNN training benefits equally from simple phoneset concatenation and manually derived universal phonesets.
Pages: 5
Related papers
50 in total
  • [21] Multilingual Multilayer Perceptron For Rapid Language Adaptation Between and Across Language Families
    Ngoc Thang Vu
    Schultz, Tanja
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 515 - 519
  • [22] An audio based piano performance evaluation method using deep neural network based acoustic modeling
    Pan, Jing
    Li, Ming
    Song, Zhanmei
    Li, Xin
    Liu, Xiaolin
    Yi, Hua
    Zhu, Manman
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3088 - 3092
  • [23] SPEAKER ADAPTATION OF A MULTILINGUAL ACOUSTIC MODEL FOR CROSS-LANGUAGE SYNTHESIS
    Himawan, Ivan
    Aryal, Sandesh
    Ouyang, Iris
    Kang, Sam
    Lanchantin, Pierre
    King, Simon
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7629 - 7633
  • [24] SPEAKER CLUSTER-BASED SPEAKER ADAPTIVE TRAINING FOR DEEP NEURAL NETWORK ACOUSTIC MODELING
    Chu, Wei
    Chen, Ruxin
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5295 - 5299
  • [25] Multilingual Acoustic and Language Modeling for Ethio-Semitic Languages
    Abate, Solomon Teferra
    Tachbelie, Martha Yifiru
    Schultz, Tanja
    INTERSPEECH 2020, 2020, : 1047 - 1051
  • [26] MULTILINGUAL ACOUSTIC MODELS USING DISTRIBUTED DEEP NEURAL NETWORKS
    Heigold, G.
    Vanhoucke, V.
    Senior, A.
    Nguyen, P.
    Ranzato, M.
    Devin, M.
    Dean, J.
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 8619 - 8623
  • [27] Improving Russian LVCSR Using Deep Neural Networks for Acoustic and Language Modeling
    Kipyatkova, Irina
    SPEECH AND COMPUTER (SPECOM 2018), 2018, 11096 : 291 - 300
  • [28] "Multilingual" Deep Neural Network For Music Genre Classification
    Dai, Jia
    Liu, Wenju
    Ni, Chongjia
    Dong, Like
    Yang, Hong
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2907 - 2911
  • [29] FACTORIZED ADAPTATION FOR DEEP NEURAL NETWORK
    Li, Jinyu
    Huang, Jui-Ting
    Gong, Yifan
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [30] Deep neural network based pier scour modeling
    Pal M.
    ISH Journal of Hydraulic Engineering, 2022, 28 (S1) : 80 - 85