An Analysis of Language Mismatch in HMM State Mapping-Based Cross-Lingual Speaker Adaptation

被引:0
|
作者
Liang, Hui [1 ]
Dines, John [1 ]
机构
[1] Idiap Res Inst, Martigny, Switzerland
关键词
HMM-based TTS; cross-lingual speaker adaptation; HMM state mapping; language mismatch;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This paper provides an in-depth analysis of the impacts of language mismatch on the performance of cross-lingual speaker adaptation. Our work confirms the influence of language mismatch between average voice distributions for synthesis and for transform estimation and the necessity of eliminating this mismatch in order to effectively utilize multiple transforms for cross-lingual speaker adaptation. Specifically, we show that language mismatch introduces unwanted language-specific information when estimating multiple transforms, thus making these transforms detrimental to adaptation performance. Our analysis demonstrates speaker characteristics should be separated from language characteristics in order to improve cross-lingual adaptation performance.
引用
收藏
页码:622 / 625
页数:4
相关论文
共 50 条
  • [21] Cross-lingual Speaker Adaptation using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis
    Xin, Detai
    Saito, Yuki
    Takamichi, Shinnosuke
    Koriyama, Tomoki
    Saruwatari, Hiroshi
    INTERSPEECH 2021, 2021, : 1614 - 1618
  • [22] Cross-lingual speaker adaptation using domain adaptation and speaker consistency loss for text-to-speech synthesis
    Xin, Detai
    Saito, Yuki
    Takamichi, Shinnosuke
    Koriyama, Tomoki
    Saruwatari, Hiroshi
    Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2021, 5 : 3376 - 3380
  • [23] Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model
    Li, Juntao
    He, Ruidan
    Ye, Hai
    Ng, Hwee Tou
    Bing, Lidong
    Yan, Rui
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 3672 - 3678
  • [24] Cross-Lingual Speaker Adaptation for Statistical Speech Synthesis Using Limited Data
    Saffjoo, Seyyed Saeed
    Demiroglu, Cenk
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 317 - 321
  • [25] TACKLING THE SCORE SHIFT IN CROSS-LINGUAL SPEAKER VERIFICATION BY EXPLOITING LANGUAGE INFORMATION
    Thienpondt, Jenthe
    Desplanques, Brecht
    Demuynck, Kris
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7187 - 7191
  • [26] Language adaptation experiments via cross-lingual embeddings for related languages
    Sharoff, Serge
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 844 - 849
  • [27] Cross-lingual latent semantic analysis for language modeling
    Kim, W
    Khudanpur, S
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 257 - 260
  • [28] Modeling Language Discrepancy for Cross-Lingual Sentiment Analysis
    Chen, Qiang
    Li, Chenliang
    Li, Wenjie
    CIKM'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2017, : 117 - 126
  • [29] CROSS-LINGUAL PHONEME MAPPING FOR LANGUAGE ROBUST CONTEXTUAL SPEECH RECOGNITION
    Patel, Ami
    Li, David
    Cho, Eunjoon
    Aleksic, Petar
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5924 - 5928
  • [30] Domain Adaptation and Language Conditioning to Improve Phonetic Posteriorgram Based Cross-Lingual Voice Conversion
    Hsu, Pin-Chieh
    Minematsu, Nobuaki
    Saito, Daisuke
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 950 - 956