Statistical Approaches to Excitation Modeling in HMM-Based Speech Synthesis

Cited by: 2
Authors
Sung, June Sig [1, 2]
Hong, Doo Hwa [1, 2]
Koo, Hyun Woo [1, 2]
Kim, Nam Soo [1, 2]
Affiliations
[1] Seoul Natl Univ, Sch Elect Engn, Seoul 151742, South Korea
[2] Seoul Natl Univ, Inst New Media & Commun, Seoul 151742, South Korea
Source
Funding
National Research Foundation, Singapore;
Keywords
HMM-based speech synthesis; waveform interpolation; principal component analysis; non-negative matrix factorization;
DOI
10.1587/transinf.E96.D.379
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In our previous study, we proposed the waveform interpolation (WI) approach to model the excitation signals for hidden Markov model (HMM)-based speech synthesis. This letter presents several techniques to improve excitation modeling within the WI framework. We propose both the time domain and frequency domain zero padding techniques to reduce the spectral distortion inherent in the synthesized excitation signal. Furthermore, we apply non-negative matrix factorization (NMF) to obtain a low-dimensional representation of the excitation signals. From a number of experiments, including a subjective listening test, the proposed method has been found to enhance the performance of the conventional excitation modeling techniques.
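The low-dimensional representation mentioned in the abstract can be illustrated with a minimal NMF sketch. This is not the authors' implementation: it uses the standard multiplicative-update rules for the Frobenius-norm objective, and the function name `nmf`, its parameters, and the random toy matrix standing in for non-negative excitation features are all assumptions made for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (features x frames) as W @ H.

    Uses multiplicative updates for the Frobenius-norm objective.
    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    n_features, n_frames = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data (assumption): 64-dimensional non-negative features over 200 frames,
# generated from 4 hidden non-negative basis vectors.
rng = np.random.default_rng(1)
V = rng.random((64, 4)) @ rng.random((4, 200))

W, H = nmf(V, rank=4)
# Each frame is now described by 4 activations (a column of H) instead of 64 values.
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In this sketch the columns of `W` play the role of basis vectors and each column of `H` is the compact per-frame parameter vector that a statistical model could be trained on; the rank is a free choice here, not a value taken from the paper.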
Pages: 379-382
Number of pages: 4
Related papers
50 in total
  • [1] Excitation Modeling Based on Waveform Interpolation for HMM-based Speech Synthesis
    Sung, June Sig
    Hong, Doo Hwa
    Oh, Kyung Hwan
    Kim, Nam Soo
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 813 - 816
  • [2] Excitation Modeling for HMM-based Speech Synthesis Based on Principal Component Analysis
    Narendra, N. P.
    Reddy, M. Kiran
    Rao, K. Sreenivasa
    2016 TWENTY SECOND NATIONAL CONFERENCE ON COMMUNICATION (NCC), 2016,
  • [3] Excitation Modeling Method Based on Inverse Filtering for HMM-Based Speech Synthesis
    Reddy, M. Kiran
    Rao, K. Sreenivasa
    MACHINE INTELLIGENCE AND SIGNAL ANALYSIS, 2019, 748 : 85 - 91
  • [4] A trainable excitation model for HMM-based speech synthesis
    Maia, R.
    Toda, T.
    Zen, H.
    Nankaku, Y.
    Tokuda, K.
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1125 - +
  • [5] State duration modeling for HMM-based speech synthesis
    Zen, Heiga
    Masuko, Takashi
    Tokuda, Keiichi
    Yoshimura, Takayoshi
    Kobayashi, Takao
    Kitamura, Tadashi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (03) : 692 - 693
  • [6] Two-band excitation for HMM-based speech synthesis
    Kim, Sang-Jin
    Hahn, Minsoo
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (01) : 378 - 381
  • [7] Improved Training of Excitation for HMM-based Parametric Speech Synthesis
    Shiga, Yoshinori
    Toda, Tomoki
    Sakai, Shinsuke
    Kawai, Hisashi
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 809 - 812
  • [8] Amplitude Spectrum based Excitation Model for HMM-based Speech Synthesis
    Wen, Zhengqi
    Tao, Jianhua
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1426 - 1429
  • [9] A Comparison of Two Approaches to Bilingual HMM-Based Speech Synthesis
    Pobar, Miran
    Justin, Tadej
    Zibert, Janez
    Mihelic, France
    Ipsic, Ivo
    TEXT, SPEECH, AND DIALOGUE, TSD 2013, 2013, 8082 : 44 - 51
  • [10] EXCITATION MODELING FOR HMM-BASED SPEECH SYNTHESIS: BREAKING DOWN THE IMPACT OF PERIODIC AND APERIODIC COMPONENTS
    Drugman, Thomas
    Raitio, Tuomo
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,