Syllable HMM based Mandarin TTS and Comparison with Concatenative TTS

Cited by: 0
Authors
Shuang, Zhiwei [1 ,2 ]
Kang, Shiyin [3 ]
Shi, Qin [2 ]
Qin, Yong [2 ]
Cai, Lianhong [3 ]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] IBM China Res Lab, Beijing, Peoples R China
[3] Tsinghua Univ, Dept Comp Sci, Beijing, Peoples R China
Keywords
Mandarin; syllable; HMM; TTS; synthesis
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces a Syllable HMM based Mandarin TTS system. Each syllable is modeled by a 10-state left-to-right HMM. We leverage the corpus and the front end of a concatenative TTS system to build the Syllable HMM based TTS system. Furthermore, we exploit the distinctive consonant/vowel structure of Mandarin syllables to improve the voiced/unvoiced decision for HMM states. Evaluation results show that the Syllable HMM based Mandarin TTS system, with a model size of 5.3 MB, can achieve an overall quality close to that of a concatenative TTS system with a data size of 1 GB.
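The left-to-right topology named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the self-loop probability of 0.6, the absence of skip transitions, and the absorbing final state are all assumptions made for illustration. In a trained system the transition probabilities would be estimated from the corpus.

```python
def left_to_right_transitions(num_states=10, self_loop=0.6):
    """Return a row-stochastic transition matrix for a left-to-right HMM.

    Each state may either stay where it is or advance to the next state.
    Backward moves and skips are not allowed (an assumption of this sketch).
    The last state is treated as absorbing.
    """
    if num_states < 1:
        raise ValueError("num_states must be at least 1")
    if not 0.0 <= self_loop < 1.0:
        raise ValueError("self_loop must be in [0, 1)")

    matrix = [[0.0] * num_states for _ in range(num_states)]
    for i in range(num_states):
        if i == num_states - 1:
            matrix[i][i] = 1.0                   # final state: stay forever
        else:
            matrix[i][i] = self_loop             # remain in the current state
            matrix[i][i + 1] = 1.0 - self_loop   # advance to the next state
    return matrix


# One matrix per syllable model; 10 states as stated in the abstract.
A = left_to_right_transitions(num_states=10, self_loop=0.6)
```

Every entry below the diagonal is zero, which is what makes the model left-to-right: once a state has been left, it cannot be revisited.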
Pages: 1755 / +
Number of pages: 2