An Analysis of Language Mismatch in HMM State Mapping-Based Cross-Lingual Speaker Adaptation

Authors
Liang, Hui [1]
Dines, John [1]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
Keywords
HMM-based TTS; cross-lingual speaker adaptation; HMM state mapping; language mismatch
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper provides an in-depth analysis of the impact of language mismatch on the performance of cross-lingual speaker adaptation. Our work confirms the influence of language mismatch between the average voice distributions used for synthesis and those used for transform estimation, and the necessity of eliminating this mismatch in order to use multiple transforms effectively for cross-lingual speaker adaptation. Specifically, we show that language mismatch introduces unwanted language-specific information when multiple transforms are estimated, which makes these transforms detrimental to adaptation performance. Our analysis demonstrates that speaker characteristics should be separated from language characteristics in order to improve cross-lingual adaptation performance.
Pages: 622-625
Number of pages: 4
Related papers
50 in total
  • [1] Phonological Knowledge Guided HMM State Mapping for Cross-Lingual Speaker Adaptation
    Liang, Hui
    Dines, John
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 1836 - +
  • [2] State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis
    Wu, Yi-Jian
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 516 - 519
  • [3] CROSS-LINGUAL SPEAKER ADAPTATION FOR HMM-BASED SPEECH SYNTHESIS
    Wu, Yi-Jian
    King, Simon
    Tokuda, Keiichi
    2008 6TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2008, : 9 - 12
  • [4] UNSUPERVISED CROSS-LINGUAL SPEAKER ADAPTATION FOR HMM-BASED SPEECH SYNTHESIS
    Oura, Keiichiro
    Tokuda, Keiichi
    Yamagishi, Junichi
    King, Simon
    Wester, Mirjam
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4594 - 4597
  • [5] Analysis of unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis using KLD-based transform mapping
    Oura, Keiichiro
    Yamagishi, Junichi
    Wester, Mirjam
    King, Simon
    Tokuda, Keiichi
    SPEECH COMMUNICATION, 2012, 54 (06) : 703 - 714
  • [6] Cross-lingual Speaker Adaptation for HMM-based Speech Synthesis based on Perceptual Characteristics and Speaker Interpolation
    Oliveira, Viviane de Franca
    Shiota, Sayaka
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 982 - 985
  • [7] Cross-lingual Speaker Adaptation via Gaussian Component Mapping
    Cao, Houwei
    Lee, Tan
    Ching, P. C.
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 869 - 872
  • [8] A COMPARISON OF SUPERVISED AND UNSUPERVISED CROSS-LINGUAL SPEAKER ADAPTATION APPROACHES FOR HMM-BASED SPEECH SYNTHESIS
    Liang, Hui
    Dines, John
    Saheer, Lakshmi
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4598 - 4601
  • [9] Cross-lingual speaker adaptation for HMM-based speech synthesis considering differences between language-dependent average voices
    Peng, Xianglin
    Oura, Keiichiro
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    2010 IEEE 10TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS (ICSP2010), VOLS I-III, 2010, : 605 - 608
  • [10] Using Eigenvoices and Nearest-Neighbors in HMM-Based Cross-Lingual Speaker Adaptation With Limited Data
    Sarfjoo, Seyyed Saeed
    Demiroglu, Cenk
    King, Simon
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (04) : 839 - 851