Speaker-specific retraining for enhanced compression of unit selection text-to-speech databases

被引:0
|
作者
Nurminen, Jani [1 ]
Silen, Hanna [1 ]
Gabbouj, Moncef [1 ]
机构
[1] Tampere Univ Technol, Dept Signal Proc, Tampere, Finland
关键词
speech synthesis; unit selection; database compression; LPC PARAMETERS;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Unit selection based text-to-speech systems can generally obtain high speech quality provided that the database is large enough. In embedded applications, the related memory requirements may be excessive and often the database needs to be both pruned and compressed to fit it into the available memory space. In this paper, we study the topic of database compression. In particular, the focus is on speaker-specific optimization of the quantizers used in the database compression. First, we introduce the simple concept of dynamic quantizer structures, facilitating the use of speaker-specific optimizations by enabling convenient run-time updates. Second, we show that significant memory savings can be obtained through speaker-specific retraining while perfectly maintaining the quantization accuracy, even when the memory required for the additional codebook data is taken into account. Thus, the proposed approach can be considered effective in reducing the conventionally large footprint of unit selection based text-to-speech systems.
引用
收藏
页码:388 / 391
页数:4
相关论文
共 50 条
  • [41] Govorec(Speaker) Slovenian text-to-speech system for telecommunication applications
    Sef, T
    Gams, M
    2002 6TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I AND II, 2002, : 504 - 507
  • [42] Govorec (Speaker) - Slovenian text-to-speech synthesizer for various applications
    Sef, T
    Gams, M
    6TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL III, PROCEEDINGS: IMAGE, ACOUSTIC, SPEECH AND SIGNAL PROCESSING I, 2002, : 270 - 275
  • [43] Automatic prosodic modeling for speaker and task adaptation in text-to-speech
    LopezGonzalo, E
    RodriguezGarcia, JM
    HernandezGomez, L
    Villar, JM
    1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 927 - 930
  • [44] A Small Footprint Hybrid Statistical and Unit Selection Text-to-Speech Synthesis System for Turkish
    Guner, Ekrem
    Demiroglu, Cenk
    COMPUTER AND INFORMATION SCIENCES II, 2012, : 85 - 91
  • [45] RECENT IMPROVEMENTS OF PROBABILITY BASED PROSODY MODELS FOR UNIT SELECTION IN CONCATENATIVE TEXT-TO-SPEECH
    Zhang, Wei
    Gu, Liang
    Gao, Yuqing
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 3777 - 3780
  • [46] Enhanced quality text-to-speech for restricted domains
    不详
    BELL LABS TECHNICAL JOURNAL, 1997, 2 (04) : 169 - 170
  • [47] Refining Unit Boundaries for Mandarin Text-to-Speech Database
    Dong, Minghui
    Cen, Ling
    Chan, Paul
    Li, Haizhou
    2009 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING, 2009, : 245 - 248
  • [48] Training Multi-Speaker Neural Text-to-Speech Systems using Speaker-Imbalanced Speech Corpora
    Luong, Hieu-Thi
    Wang, Xin
    Yamagishi, Junichi
    Nishizawa, Nobuyuki
    INTERSPEECH 2019, 2019, : 1303 - 1307
  • [49] Deep Voice 2: Multi-Speaker Neural Text-to-Speech
    Arik, Sercan O.
    Diamos, Gregory
    Gibiansky, Andrew
    Miller, John
    Peng, Kainan
    Ping, Wei
    Raiman, Jonathan
    Zhou, Yanqi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [50] Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
    Jia, Ye
    Zhang, Yu
    Weiss, Ron J.
    Wang, Quan
    Shen, Jonathan
    Ren, Fei
    Chen, Zhifeng
    Nguyen, Patrick
    Pang, Ruoming
    Moreno, Ignacio Lopez
    Wu, Yonghui
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31