Superpositional HMM-based intonation synthesis using a functional F0 model

Cited by: 0

Authors:

Ni, Jinfu [1]

Shiga, Yoshinori [1]

Hori, Chiori [1]

Affiliation:

[1] Natl Inst Informat & Commun Technol, Spoken Language Commun Lab, Univ Commun Res Inst, Kyoto, Japan

Source:

2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2014

Keywords:

Intonation synthesis; HMM-based speech synthesis; functional F0 model; making focal prominence; prosody; AUTOMATIC EXTRACTION; SPEECH;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper addresses intonation synthesis that combines statistical and functional approaches, with manipulation of fundamental frequency (F0) contours in HMM-based speech synthesis. An F0 contour is represented as a sum of micro, accent, and register components on a logarithmic scale, a representation rooted in the Fujisaki model. Separate context-dependent (CD) HMMs are trained for each type of component, extracted from a speech corpus using a functional F0 model. At synthesis time, the CDHMM-generated micro, accent, and register components are superimposed to form F0 contours for the input text. Objective and subjective evaluations are carried out on a Japanese speech corpus. Compared with the conventional approach, this method not only improves the naturalness of synthetic speech by achieving better global F0 behavior but also offers flexibility for intonation manipulation by modifying the functional model parameters.
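The superposition step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `superpose_log_f0`, the `accent_gain` parameter, and all numeric values are hypothetical, and the component trajectories are assumed to be per-frame log-F0 values that some upstream model has already generated.

```python
import math

def superpose_log_f0(micro, accent, register, accent_gain=1.0):
    """Sum per-frame log-F0 components and return F0 in Hz.

    accent_gain is a hypothetical knob that scales the accent component,
    standing in for the kind of intonation manipulation the abstract mentions.
    """
    if not (len(micro) == len(accent) == len(register)):
        raise ValueError("components must have the same number of frames")
    return [
        math.exp(m + accent_gain * a + r)
        for m, a, r in zip(micro, accent, register)
    ]

# Hypothetical 4-frame trajectories (log scale), for illustration only.
register = [math.log(120.0)] * 4      # slowly varying baseline
accent = [0.0, 0.2, 0.2, 0.0]         # local accent rise
micro = [0.0, 0.01, -0.01, 0.0]       # small segmental perturbation

f0_hz = superpose_log_f0(micro, accent, register)
f0_emphasized = superpose_log_f0(micro, accent, register, accent_gain=1.5)
```

Because the components add on the logarithmic scale, scaling one of them changes the contour multiplicatively in Hz while leaving the frames where that component is zero untouched.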


Pages: 270-274

Page count: 5
