On the Construction of Unit Databanks for Text-to-Speech Systems

Cited by: 0
Authors
Latsch, Vagner L. [1]
Netto, Sergio L. [1]
Affiliations
[1] UFRJ, COPPE, Elect Engn Program, BR-21941972 Rio De Janeiro, Brazil
Keywords
Speech signal processing; speech synthesis; text-to-speech
DOI
Not available
Chinese Library Classification (CLC) number
TN [Electronic technology, communication technology]
Discipline classification code
0809
Abstract
This work addresses one stage in the development of a text-to-speech (TTS) system that demands a great amount of time and effort and strongly affects the resulting speech quality: the determination of the speech-unit databank. To that end, we present a software tool, called the Editor, that integrates all major steps of the database determination in a single environment. The whole process includes recording, segmentation, and labeling of the speech units to be concatenated in the time domain. The Editor includes a low-cost and precise method for determining the pitch marks, using an auxiliary signal obtained from a contact (throat) microphone. For the phonetic labeling of the speech, we revise an algorithm for acoustic segmentation, which yields interesting results when proper operating conditions are imposed. The result is a simplified procedure for creating a complete unit database, fully integrated into a single, user-friendly system.
Pages: 340-343
Page count: 4
Related papers
50 in total
  • [31] Constructing text-to-speech systems for languages with unknown pronunciations
    Sawada, Kei
    Hashimoto, Kei
    Oura, Keiichiro
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    ACOUSTICAL SCIENCE AND TECHNOLOGY, 2018, 39 (02) : 119 - 129
  • [32] Romanian language statistics and resources for text-to-speech systems
    Stan, Adriana
    Giurgiu, Mircea
    2010 9TH INTERNATIONAL SYMPOSIUM ON ELECTRONICS AND TELECOMMUNICATIONS (ISETC), 2010, : 381 - 384
  • [33] EVALUATING TEXT-TO-SPEECH SYSTEMS - SOME METHODOLOGICAL ASPECTS
    VANBEZOOIJEN, R
    POLS, LCW
    SPEECH COMMUNICATION, 1990, 9 (04) : 263 - 270
  • [34] Neural networks in text-to-speech systems for the Greek language
    Falas, T
    Stafylopatis, AG
    MELECON 2000: INFORMATION TECHNOLOGY AND ELECTROTECHNOLOGY FOR THE MEDITERRANEAN COUNTRIES, VOLS 1-3, PROCEEDINGS, 2000, : 574 - 577
  • [35] Syllable duration prediction for Farsi text-to-speech systems
    Nazari, B.
    Nayebi, K.
    Sheikhzadeh, H.
    Scientia Iranica, 2004, 11 (03) : 225 - 233
  • [36] NORMALIZATION OF TEXT MESSAGES FOR TEXT-TO-SPEECH
    Pennell, Deana L.
    Liu, Yang
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4842 - 4845
  • [37] Automatic Speech Recognition Used for Intelligibility Assessment of Text-to-Speech Systems
    Vich, Robert
    Nouza, Jan
    Vondra, Martin
    VERBAL AND NONVERBAL FEATURES OF HUMAN-HUMAN AND HUMAN-MACHINE INTERACTIONS, 2008, 5042 : 136 - +
  • [38] Control of intonation in HMM based text-to-speech systems
    Cai, L. (clh-dcs@tsinghua.edu.cn), 1600, Tsinghua University (53):
  • [39] Text and Speech Corpora for Text-To-Speech Synthesis of Tales
    Doukhan, David
    Rosset, Sophie
    Rilliard, Albert
    d'Alessandro, Christophe
    Adda-Decker, Martine
    LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012, : 1003 - 1010
  • [40] Comparison of measures of speech quality for listening tests of text-to-speech systems
    Viswanathan, M
    Viswanathan, M
    PROCEEDINGS OF THE 2002 IEEE WORKSHOP ON SPEECH SYNTHESIS, 2002, : 11 - 14