On the Construction of Unit Databanks for Text-to-Speech Systems

Cited by: 0

Authors:

Latsch, Vagner L. [1]

Netto, Sergio L. [1]

Affiliations:

[1] UFRJ, COPPE, Elect Engn Program, BR-21941972 Rio De Janeiro, Brazil

Source:

PROCEEDINGS OF THE IEEE INTERNATIONAL TELECOMMUNICATIONS SYMPOSIUM, VOLS 1 AND 2 | 2006

Keywords:

Speech signal processing; speech synthesis; text-to-speech

DOI:

Not available

Chinese Library Classification:

TN [Electronic technology, communication technology]

Discipline classification code:

0809

Abstract:

This work addresses one stage in the development of a text-to-speech (TTS) system that demands considerable time and effort and strongly affects the resulting speech quality: the construction of the speech-unit databank. To this end, we present a software tool, called the Editor, that integrates all major steps of databank construction in a single environment. The process comprises recording, segmentation, and labeling of the speech units to be concatenated in the time domain. The Editor includes a low-cost, precise method for determining pitch marks that uses an auxiliary signal obtained from a contact (throat) microphone. For phonetic labeling, we revise an algorithm for acoustic segmentation, which yields interesting results when proper operating conditions are imposed. The result is a simplified procedure for creating a complete unit database, fully integrated into a single, user-friendly system.

Pages: 340-343

Page count: 4
