Syllable HMM based Mandarin TTS and Comparison with Concatenative TTS

Cited by: 0

Authors:

Shuang, Zhiwei ^{[1,2]}

Kang, Shiyin ^{[3]}

Shi, Qin ^{[2]}

Qin, Yong ^{[2]}

Cai, Lianhong ^{[3]}

Affiliations:

[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China

[2] IBM China Res Lab, Beijing, Peoples R China

[3] Tsinghua Univ, Dept Comp Sci, Beijing, Peoples R China

Source:

INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5 | 2009

Keywords:

Mandarin; syllable; HMM; TTS; synthesis;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper introduces a Syllable HMM based Mandarin TTS system. Each syllable is modeled by a 10-state left-to-right HMM. We leverage the corpus and the front end of a concatenative TTS system to build the Syllable HMM based TTS system. Furthermore, we utilize the unique consonant/vowel structure of Mandarin syllables to improve the voiced/unvoiced decision of HMM states. Evaluation results show that the Syllable HMM based Mandarin TTS system, with a model size of 5.3 MB, can achieve an overall quality close to that of a concatenative TTS system with a data size of 1 GB.
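
As an illustration of the "10-state left-to-right" topology mentioned in the abstract, the sketch below builds a transition matrix in which each state may only stay where it is or advance to the next state. This is a minimal sketch, not the authors' implementation: the function name `left_to_right_transitions`, the no-skip constraint, and the self-loop probability of 0.6 are assumptions chosen for illustration, and the abstract does not say how state durations or transitions were actually modeled.

```python
def left_to_right_transitions(num_states=10, self_loop=0.6):
    """Return a num_states x num_states left-to-right transition matrix.

    Row i holds the probabilities of moving from state i to every state.
    Assumptions (not from the paper): no state skipping, and one shared
    self-loop probability for all non-final states.
    """
    if num_states < 1:
        raise ValueError("num_states must be at least 1")
    if not 0.0 < self_loop < 1.0:
        raise ValueError("self_loop must lie strictly between 0 and 1")

    matrix = [[0.0] * num_states for _ in range(num_states)]
    for i in range(num_states):
        if i == num_states - 1:
            # The last state only loops on itself in this sketch; leaving
            # the syllable model would be handled outside the matrix.
            matrix[i][i] = 1.0
        else:
            matrix[i][i] = self_loop            # stay in the same state
            matrix[i][i + 1] = 1.0 - self_loop  # advance one state
    return matrix


if __name__ == "__main__":
    for row in left_to_right_transitions():
        print(" ".join(f"{p:.1f}" for p in row))
```

Every entry below the diagonal is zero, which is what makes the model left-to-right: once a state has been left, it cannot be revisited within the same syllable.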

Pages: 1755 / +

Page count: 2
