A Covariance-Tying Technique for HMM-Based Speech Synthesis

Cited by: 10
Authors
Oura, Keiichiro [1]
Zen, Heiga [1]
Nankaku, Yoshihiko [1]
Lee, Akinobu [1]
Tokuda, Keiichi [1]
Affiliation
[1] Nagoya Inst Technol, Dept Comp Sci & Engn, Nagoya, Aichi 4668555, Japan
Source
Keywords
HMM; speech synthesis; decision tree; context-clustering; MDL criterion; embedded device
DOI
10.1587/transinf.E93.D.595
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A technique for reducing the footprints of HMM-based speech synthesis systems by tying all covariance matrices of state distributions is described. HMM-based speech synthesis systems usually leave smaller footprints than unit-selection synthesis systems because they store statistics rather than speech waveforms. However, further reduction is essential to put them on embedded devices, which have limited memory. In accordance with the empirical knowledge that covariance matrices have a smaller impact on the quality of synthesized speech than mean vectors, we propose a technique for clustering mean vectors while tying all covariance matrices. Subjective listening test results showed that the proposed technique can shrink the footprints of an HMM-based speech synthesis system while retaining the quality of the synthesized speech.
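The footprint argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes diagonal covariances, pools them by an occupancy-weighted average, and simply counts stored values. The names `tie_variances`, `parameter_count`, and the toy `states` data are hypothetical, and the decision-tree clustering of mean vectors under the MDL criterion is not reproduced here.

```python
def tie_variances(states):
    """Pool per-state diagonal variances into one shared variance vector.

    Each state is a dict with 'mean' and 'var' (lists of equal length)
    and 'occ' (an assumed state occupancy count used as pooling weight).
    """
    dim = len(states[0]["var"])
    total_occ = sum(s["occ"] for s in states)
    return [
        sum(s["occ"] * s["var"][d] for s in states) / total_occ
        for d in range(dim)
    ]


def parameter_count(num_states, dim, tied):
    """Number of stored values for means plus diagonal variances."""
    means = num_states * dim
    # With tying, a single variance vector is shared by every state.
    variances = dim if tied else num_states * dim
    return means + variances


# Toy data (hypothetical): two states with 2-dimensional features.
states = [
    {"mean": [0.0, 1.0], "var": [1.0, 4.0], "occ": 30.0},
    {"mean": [2.0, -1.0], "var": [3.0, 2.0], "occ": 10.0},
]

shared_var = tie_variances(states)
untied = parameter_count(len(states), 2, tied=False)
tied = parameter_count(len(states), 2, tied=True)
print(shared_var, untied, tied)
```

In this sketch the variance storage no longer grows with the number of states, so the saving becomes larger as the model has more clustered states; each state keeps its own mean vector.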
Pages: 595-601
Number of pages: 7