Syllable HMM based Mandarin TTS and Comparison with Concatenative TTS

Cited by: 0
Authors
Shuang, Zhiwei [1 ,2 ]
Kang, Shiyin [3 ]
Shi, Qin [2 ]
Qin, Yong [2 ]
Cai, Lianhong [3 ]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] IBM China Res Lab, Beijing, Peoples R China
[3] Tsinghua Univ, Dept Comp Sci, Beijing, Peoples R China
Keywords
Mandarin; syllable; HMM; TTS; synthesis;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces a Syllable HMM based Mandarin TTS system. Each syllable is modeled by a 10-state left-to-right HMM. We leverage the corpus and the front end of a concatenative TTS system to build the Syllable HMM based TTS system. Furthermore, we utilize the unique consonant/vowel structure of the Mandarin syllable to improve the voiced/unvoiced decision for HMM states. Evaluation results show that the Syllable HMM based Mandarin TTS system, with a model size of 5.3 MB, can achieve an overall quality close to that of a concatenative TTS system with a data size of 1 GB.
Pages: 1755 / +
Page count: 2
Related papers
50 in total
  • [31] Reducing Computational and Memory Cost for HMM-Based Embedded TTS System
    Fu, Rong
    Zhao, Zengliang
    Tu, Qixiong
    APPLIED INFORMATICS AND COMMUNICATION, PT I, 2011, 224 : 602 - +
  • [32] Speech factorization for HMM-TTS based on cluster adaptive training
    Latorre, Javier
    Wan, Vincent
    Gales, Mark J. F.
    Chen, Langzhou
    Chin, K. K.
    Knill, Kate
    Akamine, Masami
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 970 - 973
  • [33] Speech-rate-variable HMM-based Japanese TTS system
    Iwano, K
    Yamada, M
    Togawa, T
    Furui, S
    PROCEEDINGS OF THE 2002 IEEE WORKSHOP ON SPEECH SYNTHESIS, 2002, : 219 - 222
  • [34] Formant-based Frequency Warping for Improving Speaker Adaptation in HMM TTS
    Zhuang, Xin
    Qian, Yao
    Soong, Frank
    Wu, Yijian
    Zhang, Bo
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 817 - +
  • [35] Applying a speaker-dependent speech compression technique to concatenative TTS synthesizers
    Lee, Chang-Heon
    Jung, Sung-Kyo
    Kang, Hong-Goo
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (02): : 632 - 640
  • [36] Design and evaluation of prosodically-sensitive concatenative units for a Korean TTS system
    Yoon, Kyuchul
    COMPUTER SPEECH AND LANGUAGE, 2008, 22 (03): : 273 - 294
  • [37] Intonation Modeling Using Linguistic, Production and Prosodic Constraints for Syllable based TTS Systems
    Reddy, V. Ramu
    Rao, K. Sreenivasa
    INTERNATIONAL CONFERENCE ON MODELLING OPTIMIZATION AND COMPUTING, 2012, 38 : 2772 - 2783
  • [38] HMM-based intonation and prosody prediction for Tibetan TTS
    Zhao, Ying
    Journal of Southwest Minzu University (Natural Science Edition), 2010, 36 (06) : 1060 - 1062
  • [39] An HMM Trajectory Tiling (HTT) Approach to High Quality TTS
    Qian, Yao
    Yan, Zhi-jie
    Wu, Yijian
    Soong, Frank
    Zhuang, Xin
    Kong, Shengyi
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 422 - +
  • [40] Waveform Interpolation-Based Speech Analysis/Synthesis for HMM-Based TTS Systems
    Jung, Chi-Sang
    Joo, Young-Sun
    Kang, Hong-Goo
    IEEE SIGNAL PROCESSING LETTERS, 2012, 19 (12) : 809 - 812