Speaker Adaptation using Relevance Vector Regression for HMM-based Expressive TTS

Cited by: 0
Authors
Hong, Doo Hwa [1]
Lee, Joun Yeop
Jang, Se Young
Kim, Nam Soo
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul, South Korea
Keywords
speech synthesis; speaker adaptation; MLLR; relevance vector regression; likelihood linear regression; kernel regression
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The conventional maximum likelihood linear regression (MLLR)-based adaptation algorithm applied to acoustic hidden Markov models (HMMs) is restricted to linear regression and therefore cannot represent the details of the mapping characteristics. To overcome this problem, we propose a relevance vector regression (RVR)-based model parameter adaptation technique. In this framework, the conventional technique is extended to use many more basis functions. The weights for constructing the transform matrix are obtained by sparse Bayesian learning, in which most of the weights become zero because of the prior defined with precision hyper-parameters. Furthermore, by using appropriate kernel functions, RVR can combine the advantages of linear and nonlinear regression. In the experiments, an emotional speech database is used for adaptation to evaluate the proposed method against the conventional constrained MLLR. From the experimental results, we conclude that the RVR adaptation method performs better than the conventional method.
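The abstract gives no formulas, so the following is only the generic form of relevance vector regression as it is usually written, not the paper's exact parameterization for HMM adaptation. The symbols (kernel K, weights w_i, precision hyper-parameters alpha_i, N basis functions) are assumptions introduced here for illustration.

```latex
% Generic RVR regression function: a weighted sum of kernel basis functions plus a bias.
y(\mathbf{x}) = \sum_{i=1}^{N} w_i \, K(\mathbf{x}, \mathbf{x}_i) + w_0

% Sparse Bayesian prior: each weight has its own zero-mean Gaussian prior
% with an individual precision hyper-parameter \alpha_i.
p(\mathbf{w} \mid \boldsymbol{\alpha}) = \prod_{i=0}^{N} \mathcal{N}\!\left(w_i \mid 0, \alpha_i^{-1}\right)
```

When the hyper-parameters are estimated by maximizing the marginal likelihood, many alpha_i grow very large, which concentrates the corresponding weights at zero; this is the sparsity the abstract attributes to the prior. A linear kernel recovers a linear regression, while a nonlinear kernel such as a Gaussian one adds nonlinear basis functions.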
Pages: 1216-1220
Page count: 5