Syllable HMM based Mandarin TTS and Comparison with Concatenative TTS

Cited by: 0
Authors
Shuang, Zhiwei [1 ,2 ]
Kang, Shiyin [3 ]
Shi, Qin [2 ]
Qin, Yong [2 ]
Cai, Lianhong [3 ]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] IBM China Res Lab, Beijing, Peoples R China
[3] Tsinghua Univ, Dept Comp Sci, Beijing, Peoples R China
Keywords
Mandarin; syllable; HMM; TTS; synthesis;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper introduces a syllable HMM based Mandarin TTS system. Each syllable is modeled by a 10-state left-to-right HMM. We leverage the corpus and the front end of a concatenative TTS system to build the syllable HMM based TTS system. Furthermore, we utilize the unique consonant/vowel structure of Mandarin syllables to improve the voiced/unvoiced decision of HMM states. Evaluation results show that the syllable HMM based Mandarin TTS system, with a model size of 5.3 MB, can achieve an overall quality close to that of a concatenative TTS system with a data size of 1 GB.
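As an illustration of the 10-state left-to-right topology mentioned in the abstract, the following minimal sketch builds a transition matrix in which each state may only stay in place or advance to the next state. The function name, the no-skip constraint, and the self-loop probability of 0.6 are assumptions made for illustration; the abstract does not give the paper's actual topology details or parameter values.

```python
def left_to_right_transitions(num_states=10, self_loop=0.6):
    """Return a num_states x num_states left-to-right transition matrix.

    Assumptions (not from the paper): no state skipping, a single shared
    self-loop probability, and an absorbing final state.
    """
    matrix = [[0.0] * num_states for _ in range(num_states)]
    for i in range(num_states - 1):
        matrix[i][i] = self_loop            # stay in the current state
        matrix[i][i + 1] = 1.0 - self_loop  # advance to the next state
    matrix[-1][-1] = 1.0                    # final state absorbs
    return matrix


# One such matrix would be associated with each syllable model.
transitions = left_to_right_transitions()
```

Every entry below the diagonal is zero, which is what makes the model left-to-right: once a state has been left, it cannot be revisited.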
Pages: 1755 / +
Page count: 2