Speaker Adaptation using Relevance Vector Regression for HMM-based Expressive TTS

Cited by: 0
Authors
Hong, Doo Hwa [1]
Lee, Joun Yeop
Jang, Se Young
Kim, Nam Soo
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul, South Korea
Keywords
speech synthesis; speaker adaptation; MLLR; relevance vector regression; LIKELIHOOD LINEAR-REGRESSION; KERNEL REGRESSION;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The conventional maximum likelihood linear regression (MLLR)-based adaptation algorithm applied to acoustic hidden Markov models (HMMs) is restricted to linear regression and therefore cannot represent the details of the mapping characteristics. To overcome this problem, we propose a relevance vector regression (RVR)-based model parameter adaptation technique. In this framework, the conventional technique is extended to use many more basis functions. The weights that form the transform matrix are obtained by sparse Bayesian learning, in which most of the weights become zero because of the prior defined through the precision hyper-parameters. Furthermore, by using appropriate kernel functions, RVR can combine the advantages of linear and nonlinear regression. In the experiments, an emotional speech database is used for adaptation to evaluate the proposed method against conventional constrained MLLR. From the experimental results, we conclude that the RVR adaptation method performs better than the conventional method.
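The sparse Bayesian learning step mentioned in the abstract can be illustrated with a minimal sketch of generic relevance vector regression on a toy one-dimensional problem. This is not the authors' HMM adaptation procedure: the toy data, the helper names `rbf_design` and `fit_rvr`, the pruning threshold, and the iteration count are all assumptions made for illustration. The design matrix mixes a bias column, a linear column, and Gaussian kernel columns, so the model can draw on both linear and nonlinear terms, and the per-weight precision hyper-parameters drive most weights to zero.

```python
import numpy as np

def rbf_design(x, centers, width):
    """Design matrix: bias column, linear column, one Gaussian kernel column per center."""
    d2 = (x[:, None] - centers[None, :]) ** 2
    return np.hstack([np.ones((len(x), 1)), x[:, None], np.exp(-d2 / (2.0 * width**2))])

def fit_rvr(Phi, t, n_iter=500, prune_at=1e5):
    """Sparse Bayesian regression with one precision hyper-parameter per weight.

    Hypothetical helper: weights whose precision exceeds `prune_at` are removed
    from the model, which is equivalent to fixing them at zero.
    """
    n, m = Phi.shape
    alpha = np.ones(m)            # per-weight precisions of the zero-mean Gaussian prior
    beta = 1.0 / np.var(t)        # noise precision
    active = np.arange(m)         # indices of basis functions still in the model
    for _ in range(n_iter):
        P = Phi[:, active]
        Sigma = np.linalg.inv(np.diag(alpha[active]) + beta * P.T @ P)  # posterior covariance
        mu = beta * Sigma @ P.T @ t                                     # posterior mean
        gamma = np.clip(1.0 - alpha[active] * np.diag(Sigma), 1e-12, 1.0)
        alpha[active] = gamma / (mu**2 + 1e-12)        # re-estimate the precisions
        resid = t - P @ mu
        beta = (n - gamma.sum()) / (resid @ resid + 1e-12)  # re-estimate the noise precision
        active = active[alpha[active] < prune_at]      # drop weights with very large precision
    # Posterior mean for the surviving basis functions; pruned weights stay exactly zero.
    P = Phi[:, active]
    Sigma = np.linalg.inv(np.diag(alpha[active]) + beta * P.T @ P)
    w = np.zeros(m)
    w[active] = beta * Sigma @ P.T @ t
    return w

# Toy data (an assumption): a linear trend plus one Gaussian bump, with small noise.
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 60)
t = 0.5 * x + np.exp(-(x - 1.0) ** 2 / (2.0 * 0.5**2)) + 0.05 * rng.standard_normal(x.size)

centers = np.linspace(-3.0, 3.0, 13)
Phi = rbf_design(x, centers, width=0.5)
w = fit_rvr(Phi, t)

print("basis functions:", w.size)
print("nonzero weights:", np.count_nonzero(w))
print("mean squared error:", np.mean((t - Phi @ w) ** 2))
```

Pruning a weight once its precision grows very large is a common practical shortcut in relevance vector implementations; it keeps the matrix inversion well conditioned while leaving the retained weights free to fit the data.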
Pages: 1216-1220
Page count: 5
Related papers
50 in total
  • [21] Speaker and style adaptation using average voice model for style control in HMM-based speech synthesis
    Tachibana, Makoto
    Izawa, Shinsuke
    Nose, Takashi
    Kobayashi, Takao
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4633 - 4636
  • [22] Speaker adaptation method for acoustic-to-articulatory inversion using an HMM-based speech production model
    Hiroya, S
    Honda, M
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2004, E87D (05): : 1071 - 1078
  • [23] HMM-based Speaker Characteristics Emphasis Using Average Voice Model
    Nose, Takashi
    Adada, Junichi
    Kobayashi, Takao
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2599 - 2602
  • [24] HMM-based TTS for Hanoi Vietnamese: issues in design and evaluation
    Nguyen Thi Thu Trang
    D'Alessandro, Christophe
    Rilliard, Albert
    Tran Do Dat
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2310 - 2314
  • [25] Rich Context Modeling for High Quality HMM-Based TTS
    Yan, Zhi-Jie
    Qian, Yao
    Soong, Frank K.
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1767 - 1770
  • [26] Speaker state recognition using an HMM-based feature extraction method
    Gajsek, R.
    Mihelic, F.
    Dobrisek, S.
    COMPUTER SPEECH AND LANGUAGE, 2013, 27 (01): : 135 - 150
  • [27] Using Eigenvoices and Nearest-Neighbors in HMM-Based Cross-Lingual Speaker Adaptation With Limited Data
    Sarfjoo, Seyyed Saeed
    Demiroglu, Cenk
    King, Simon
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (04) : 839 - 851
  • [28] An acoustic model adaptation using hmm-based speech synthesis
    Tanaka, K
    Kuroiwa, S
    Tsuge, S
    Ren, F
    2003 INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, PROCEEDINGS, 2003, : 368 - 373
  • [29] A hybrid score measurement for HMM-based speaker verification
    Gu, Y
    Thomas, T
    ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 317 - 320
  • [30] Speaker interpolation for HMM-based speech synthesis system
    Yoshimura, Takayoshi, 2000, Acoustical Soc Jpn, Tokyo, Japan (21):