Superpositional HMM-based intonation synthesis using a functional F0 model

Authors
Ni, Jinfu [1]
Shiga, Yoshinori [1]
Hori, Chiori [1]
Affiliations
[1] National Institute of Information and Communications Technology, Spoken Language Communication Laboratory, Universal Communication Research Institute, Kyoto, Japan
Keywords
Intonation synthesis; HMM-based speech synthesis; functional F0 model; making focal prominence; prosody; AUTOMATIC EXTRACTION; SPEECH
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses intonation synthesis that combines statistical and functional approaches, manipulating fundamental frequency (F0) contours in HMM-based speech synthesis. An F0 contour is represented as the sum of micro, accent, and register components on a logarithmic scale, a representation rooted in the Fujisaki model. Separate context-dependent (CD) HMMs are trained for each type of component, extracted from a speech corpus using a functional F0 model. At synthesis time, the CDHMM-generated micro, accent, and register components are superimposed to form F0 contours for the input text. Objective and subjective evaluations are carried out on a Japanese speech corpus. Compared with the conventional approach, the method not only improves the naturalness of synthetic speech by achieving better global F0 behavior but also allows flexible intonation manipulation through modification of the functional model parameters.
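The superposition step described in the abstract can be illustrated with a minimal sketch. It assumes the three components are already available as equal-length per-frame tracks in the natural-log domain; in the paper they are generated by CDHMMs, whereas here they are hand-written toy values, not data from the paper. The function names `superimpose_f0` and `scale_accent` are hypothetical, and scaling the accent track is only an illustrative stand-in for "modifying the functional model parameters", not the authors' procedure.

```python
import math


def superimpose_f0(micro, accent, register):
    """Add three log-F0 component tracks frame by frame; return F0 in Hz."""
    if not (len(micro) == len(accent) == len(register)):
        raise ValueError("component tracks must have the same number of frames")
    return [math.exp(m + a + r) for m, a, r in zip(micro, accent, register)]


def scale_accent(accent, factor):
    """Hypothetical manipulation: scale the accent component by a constant."""
    return [factor * a for a in accent]


# Toy 4-frame tracks (assumed values, natural-log domain).
register = [math.log(120.0)] * 4   # overall pitch level, here a flat 120 Hz
accent = [0.0, 0.2, 0.2, 0.0]      # local rise over the accented frames
micro = [0.0, 0.01, -0.01, 0.0]    # small segmental perturbations

f0 = superimpose_f0(micro, accent, register)

# Emphasize the accent without touching the register or micro components.
emphasized = superimpose_f0(micro, scale_accent(accent, 1.5), register)
```

Because the components add in the log domain, scaling the accent track changes F0 multiplicatively only in the frames where the accent component is nonzero; frames with zero accent keep their original value.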
Pages: 270-274
Number of pages: 5