Adaptation of Prosody in Speech Synthesis by Changing Command Values of the Generation Process Model of Fundamental Frequency

Authors
Hirose, Keikichi [1]
Ochi, Keiko [1]
Mihara, Ryusuke [1]
Hashimoto, Hiroya [1]
Saito, Daisuke [1]
Minematsu, Nobuaki [1]
Affiliations
[1] Univ Tokyo, Dept Informat & Commun Engn, Tokyo, Japan
Keywords
prosody adaptation; generation process model; speech synthesis; parameters; contours
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A method was developed to adapt prosody to a new speaker or style in speech synthesis. It predicts the differences between the target and original speakers/styles and applies them to the original. Differences in fundamental frequency (F0) contours are represented within the framework of the generation process model, as differences in the command magnitudes/amplitudes. While training the original speaker/style requires a certain amount of corpus, the corpus for training the command differences can be small. Furthermore, for style adaptation, the corpus need not be uttered by the same speaker as the original style. Speech was synthesized using an HMM-based speech synthesis system, with prosody controlled by the method. Listening experiments on synthetic speech, for both style adaptation and voice conversion, showed the validity of the method.
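The idea of shifting command values can be illustrated with a minimal sketch. It assumes the standard formulation of the generation process model (log F0 as a baseline plus phrase and accent components) and assumes that the predicted differences are simply added to the original command magnitudes/amplitudes. The abstract does not state how the differences are predicted or applied, so the function names, the constants (alpha = 3.0, beta = 20.0, gamma = 0.9, values commonly used with this model), and the additive rule are illustrative assumptions, not the authors' implementation.

```python
import math

# Commonly used constants for the generation process model (assumed here).
ALPHA = 3.0   # natural angular frequency of the phrase control mechanism (1/s)
BETA = 20.0   # natural angular frequency of the accent control mechanism (1/s)
GAMMA = 0.9   # ceiling of the accent component


def phrase_response(t, alpha=ALPHA):
    """Impulse response of the phrase control mechanism."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0


def accent_response(t, beta=BETA, gamma=GAMMA):
    """Step response of the accent control mechanism, clipped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def log_f0(t, fb, phrases, accents):
    """Log F0 at time t.

    phrases: list of (onset_time, magnitude)
    accents: list of (onset_time, offset_time, amplitude)
    """
    value = math.log(fb)
    for t0, ap in phrases:
        value += ap * phrase_response(t - t0)
    for t1, t2, aa in accents:
        value += aa * (accent_response(t - t1) - accent_response(t - t2))
    return value


def adapt_commands(phrases, accents, d_phrase, d_accent):
    """Hypothetical adaptation step: add predicted differences to the
    original command magnitudes/amplitudes, keeping the timing unchanged."""
    new_phrases = [(t0, ap + d) for (t0, ap), d in zip(phrases, d_phrase)]
    new_accents = [(t1, t2, aa + d)
                   for (t1, t2, aa), d in zip(accents, d_accent)]
    return new_phrases, new_accents


# Made-up example values, for illustration only.
phrases = [(0.0, 0.5)]
accents = [(0.2, 0.5, 0.4)]
new_phrases, new_accents = adapt_commands(phrases, accents, [0.1], [0.2])
original = math.exp(log_f0(0.4, 100.0, phrases, accents))
adapted = math.exp(log_f0(0.4, 100.0, new_phrases, new_accents))
```

In this sketch only the command values change, so the adapted contour keeps the timing structure of the original one while its phrase and accent components are rescaled.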
Pages: 2804 / +
Number of pages: 2