Syllable HMM based Mandarin TTS and Comparison with Concatenative TTS

Cited by: 0

Authors:

Shuang, Zhiwei ^{[1,2]}

Kang, Shiyin ^{[3]}

Shi, Qin ^{[2]}

Qin, Yong ^{[2]}

Cai, Lianhong ^{[3]}

Affiliations:

[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China

[2] IBM China Res Lab, Beijing, Peoples R China

[3] Tsinghua Univ, Dept Comp Sci, Beijing, Peoples R China

Source:

INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5 | 2009

Keywords:

Mandarin; syllable; HMM; TTS; synthesis;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper introduces a Syllable HMM based Mandarin TTS system. 10-state left-to-right HMMs are used to model each syllable. We leverage the corpus and the front end of a concatenative TTS system to build the Syllable HMM based TTS system. Furthermore, we utilize the unique consonant/vowel structure of Mandarin syllables to improve the voiced/unvoiced decision of HMM states. Evaluation results show that the Syllable HMM based Mandarin TTS system, with a model size of 5.3 MB, can achieve an overall quality close to that of a concatenative TTS system with a data size of 1 GB.


Pages: 1755 / +

Page count: 2
