Superpositional HMM-based intonation synthesis using a functional F0 model

Cited by: 0

Authors:

Ni, Jinfu [1]

Shiga, Yoshinori [1]

Hori, Chiori [1]

Affiliation:

[1] Natl Inst Informat & Commun Technol, Spoken Language Commun Lab, Univ Commun Res Inst, Kyoto, Japan

Source:

2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2014

Keywords:

Intonation synthesis; HMM-based speech synthesis; functional F0 model; making focal prominence; prosody; AUTOMATIC EXTRACTION; SPEECH;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper addresses intonation synthesis that combines statistical and functional approaches, with manipulation of fundamental frequency (F0) contours, in HMM-based speech synthesis. An F0 contour is represented on a logarithmic scale as the sum of micro, accent, and register components, a representation rooted in the Fujisaki model. Separate context-dependent (CD) HMMs are trained for each type of component, which is extracted from a speech corpus using a functional F0 model. At synthesis time, the CDHMM-generated micro, accent, and register components are superimposed to form F0 contours for the input text. Objective and subjective evaluations are carried out on a Japanese speech corpus. Compared with the conventional approach, the method improves the naturalness of synthetic speech by achieving better global F0 behavior, and it also allows flexible intonation manipulation through modification of the functional model parameters.
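The superposition step described in the abstract can be illustrated with a minimal sketch. It assumes natural-log units and frame-aligned component tracks, and the function name `superpose_f0` and all numeric values are hypothetical. It does not reproduce the paper's functional F0 model or its HMM training; it only shows what summing components on a logarithmic scale means.

```python
import math

def superpose_f0(micro, accent, register):
    """Sum three log-scale component tracks frame by frame; return F0 in Hz."""
    if not (len(micro) == len(accent) == len(register)):
        raise ValueError("component tracks must have the same number of frames")
    # Addition in the log domain corresponds to multiplication in Hz.
    return [math.exp(m + a + r) for m, a, r in zip(micro, accent, register)]

# Hypothetical values for three voiced frames (natural-log units, assumed).
register = [math.log(120.0)] * 3   # slowly varying baseline, here a flat 120 Hz
accent = [0.0, 0.2, 0.1]           # accent-related rise and fall
micro = [0.01, -0.02, 0.0]         # small segmental perturbations

f0_hz = superpose_f0(micro, accent, register)

# Changing only the accent component alters the contour locally,
# leaving the register and micro components untouched.
emphasised = superpose_f0(micro, [0.0, 0.3, 0.1], register)
```

Because the components are kept separate until the final sum, one of them can be edited without touching the others, which is the kind of manipulation the abstract refers to.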


Pages: 270-274

Page count: 5

