On the Construction of Unit Databanks for Text-to-Speech Systems

Cited by: 0

Authors:

Latsch, Vagner L. [1]

Netto, Sergio L. [1]

Affiliations:

[1] UFRJ, COPPE, Elect Engn Program, BR-21941972 Rio De Janeiro, Brazil

Source:

PROCEEDINGS OF THE IEEE INTERNATIONAL TELECOMMUNICATIONS SYMPOSIUM, VOLS 1 AND 2 | 2006

Keywords:

Speech signal processing; speech synthesis; text-to-speech

DOI:

Not available

Chinese Library Classification (CLC) number:

TN [Electronic technology, communication technology]

Discipline classification code:

0809

Abstract:

This work deals with one stage in the development of a text-to-speech (TTS) system that demands a great amount of time and effort and is strongly related to the resulting speech quality: the determination of the speech-unit databank. To that end, we present a software tool, the so-called Editor, that integrates all major steps of the database determination in a single environment. The whole process includes recording, segmentation, and labeling of the speech units to be concatenated in the time domain. The Editor includes a low-cost and precise method for determining the pitch marks, which uses an auxiliary signal obtained from a contact (throat) microphone. For the phonetic speech labeling, we revise an algorithm for acoustic segmentation, which yields interesting results when proper operating conditions are imposed. The result is a simplified procedure for creating a complete unit database, fully integrated into a single, user-friendly system.


Pages: 340-343

Page count: 4

