UNSUPERVISED CROSS-LINGUAL SPEAKER ADAPTATION FOR HMM-BASED SPEECH SYNTHESIS USING TWO-PASS DECISION TREE CONSTRUCTION

被引:9
|
作者
Gibson, Matthew [1 ]
Hirsimaki, Teemu [2 ]
Karhila, Reima [2 ]
Kurimo, Mikko [2 ]
Byrne, William [1 ]
机构
[1] Univ Cambridge, Dept Engn, Trumpington St, Cambridge CB2 1PZ, England
[2] Aalto Univ, FIN-5400 Helsinki, Finland
关键词
HMM-based speech synthesis; unsupervised speaker adaptation; cross-lingual;
D O I
10.1109/ICASSP.2010.5495196
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
This paper demonstrates how unsupervised cross-lingual adaptation of HMM-based speech synthesis models may be performed without explicit knowledge of the adaptation data language. A two-pass decision tree construction technique is deployed for this purpose. Using parallel translated datasets, cross-lingual and intralingual adaptation are compared in a controlled manner. Listener evaluations reveal that the proposed method delivers performance approaching that of unsupervised intralingual adaptation.
引用
收藏
页码:4642 / 4645
页数:4
相关论文
共 50 条
  • [1] Unsupervised Intralingual and Cross-Lingual Speaker Adaptation for HMM-Based Speech Synthesis Using Two-Pass Decision Tree Construction
    Gibson, Matthew
    Byrne, William
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (04): : 895 - 904
  • [2] UNSUPERVISED CROSS-LINGUAL SPEAKER ADAPTATION FOR HMM-BASED SPEECH SYNTHESIS
    Oura, Keiichiro
    Tokuda, Keiichi
    Yamagishi, Junichi
    King, Simon
    Wester, Mirjam
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4594 - 4597
  • [3] Two-pass decision tree construction for unsupervised adaptation of HMM-based synthesis models
    Gibson, Matthew
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1763 - 1766
  • [4] CROSS-LINGUAL SPEAKER ADAPTATION FOR HMM-BASED SPEECH SYNTHESIS
    Wu, Yi-Jian
    King, Simon
    Tokuda, Keiichi
    2008 6TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2008, : 9 - 12
  • [5] A COMPARISON OF SUPERVISED AND UNSUPERVISED CROSS-LINGUAL SPEAKER ADAPTATION APPROACHES FOR HMM-BASED SPEECH SYNTHESIS
    Liang, Hui
    Dines, John
    Saheer, Lakshmi
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4598 - 4601
  • [6] Personalising speech-to-speech translation: Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis
    Dines, John
    Liang, Hui
    Saheer, Lakshmi
    Gibson, Matthew
    Byrne, William
    Oura, Keiichiro
    Tokuda, Keiichi
    Yamagishi, Junichi
    King, Simon
    Wester, Mirjam
    Hirsimaki, Teemu
    Karhila, Reima
    Kurimo, Mikko
    COMPUTER SPEECH AND LANGUAGE, 2013, 27 (02): : 420 - 437
  • [7] Cross-lingual Speaker Adaptation for HMM-based Speech Synthesis based on Perceptual Characteristics and Speaker Interpolation
    Oliveira, Viviane de Franca
    Shiota, Sayaka
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 982 - 985
  • [8] Analysis of unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis using KLD-based transform mapping
    Oura, Keiichiro
    Yamagishi, Junichi
    Wester, Mirjam
    King, Simon
    Tokuda, Keiichi
    SPEECH COMMUNICATION, 2012, 54 (06) : 703 - 714
  • [9] State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis
    Wu, Yi-Jian
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 516 - 519
  • [10] Transform Mapping Using Shared Decision Tree Context Clustering for HMM-Based Cross-Lingual Speech Synthesis
    Nagahama, Daiki
    Nose, Takashi
    Koriyama, Tomoki
    Kobayashi, Takao
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 770 - 774