Speaker Adaptation using Nonlinear Regression Techniques for HMM-based Speech Synthesis

Authors
Hong, Doo Hwa [1]
Kang, Shin Jae
Lee, Joun Yeop
Kim, Nam Soo
Affiliations
[1] Seoul National University, Department of Electrical and Computer Engineering, Seoul 151-742, South Korea
Keywords
maximum likelihood linear regression (MLLR); HMM-based speech synthesis; kernel; maximum penalized likelihood kernel regression (MPLKR); likelihood linear regression; kernel regression
DOI
10.1109/IIH-MSP.2014.152
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The maximum likelihood linear regression (MLLR) technique is a well-known approach to parameter adaptation in hidden Markov model (HMM)-based systems. In this paper, we propose maximum penalized likelihood kernel regression (MPLKR) as a novel adaptation technique for HMM-based speech synthesis. The proposed algorithm uses a kernel method to perform a nonlinear regression between the mean vector of the base model and the corresponding mean vector of the adaptation data. In the experiments, we applied several types of parametric kernels to the proposed algorithm and compared their performance with that of the conventional method. The experimental results show that the proposed algorithm outperforms the conventional method in terms of both the objective measure and subjective listening quality.
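The kernel-based nonlinear regression between base-model means and adaptation-data means mentioned in the abstract can be illustrated with a generic kernel ridge regression. This is only a minimal sketch under stated assumptions, not the MPLKR algorithm of the paper: it uses a plain least-squares penalty instead of a penalized likelihood, it assumes target mean vectors are directly available, and the function names (`rbf_kernel`, `fit_kernel_ridge`, `adapt_mean`) and the parameters `lam` and `gamma` are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two vectors (one possible parametric kernel)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]  # augmented matrix [A | B]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]


def fit_kernel_ridge(base_means, target_means, lam=1e-3, gamma=0.5):
    """Return dual weights alpha solving (K + lam * I) alpha = target_means.

    Assumption: lam is a simple ridge penalty, standing in for the
    penalty term of a penalized-likelihood criterion.
    """
    n = len(base_means)
    K = [[rbf_kernel(base_means[i], base_means[j], gamma) for j in range(n)]
         for i in range(n)]
    for i in range(n):
        K[i][i] += lam
    return solve(K, target_means)


def adapt_mean(mu, base_means, alpha, gamma=0.5):
    """Map one base-model mean vector to an adapted mean vector."""
    k = [rbf_kernel(mu, b, gamma) for b in base_means]
    dim = len(alpha[0])
    return [sum(k[i] * alpha[i][d] for i in range(len(base_means)))
            for d in range(dim)]
```

A linear transform such as MLLR applies one affine map to all means in a regression class; the kernel form above lets the mapping vary nonlinearly with the position of the base mean. Note that the same `gamma` must be passed to both `fit_kernel_ridge` and `adapt_mean`.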
Pages: 586-589
Number of pages: 4