Developments in corpus-based speech synthesis: Approaching natural conversational speech

Cited by: 19

Authors:

Campbell, N [1]

Affiliations:

[1] ATR Network Informat Labs, Dept Emergent Commun, Kyoto 6190288, Japan

Source:

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2005 / Vol. E88D / No. 03

Keywords:

speech synthesis; corpora; concatenation; paralinguistic information; communication; affect

DOI:

10.1093/ietisy/e88-d.3.376

Chinese Library Classification:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

This paper describes the special demands of conversational speech in the context of corpus-based speech synthesis. The author proposed the CHATR system of prosody-based unit-selection for concatenative waveform synthesis seven years ago, and now extends this work to incorporate the results of an analysis of five years of recordings of spontaneous conversational speech in a wide range of actual daily-life situations. The paper proposes that the expression of affect (often translated as 'kansei' in Japanese) is the main factor differentiating laboratory speech from real-world conversational speech, and presents a framework for the specification of affect through differences in speaking style and voice quality. Having an enormous corpus of speech samples available for concatenation allows the selection of complete phrase-sized utterance segments, and changes the focus of unit selection from segmental or phonetic continuity to one of prosodic and discoursal appropriateness instead. Samples of the resulting large-corpus-based synthesis can be heard at http://feast.his.atr.jp/AESOP.

Pages: 376-383

Number of pages: 8

Related records (50 in total):

[1] ANNOTATING CONVERSATIONAL SPEECH FOR CORPUS-BASED DIALOGUE SPEECH SYNTHESIZER - A FIRST STEP
Mori, Hiroki
Hitomi, Takatsugu
2012 INTERNATIONAL CONFERENCE ON SPEECH DATABASE AND ASSESSMENTS, 2012, : 135 - 140
[2] A corpus-based speech synthesis system with emotion
Iida, A
Campbell, N
Higuchi, F
Yasumura, M
SPEECH COMMUNICATION, 2003, 40 (1-2) : 161 - 187
[3] A corpus-based speech synthesis system for Uyghur
Silamu, Wushour
Tursun, Nasirjan
Tursun, Mamateli
RECENT ADVANCE OF CHINESE COMPUTING TECHNOLOGIES, 2007, : 373 - 376
[4] Synthesis of everyday conversational speech based on fine-tuning with a corpus for speech synthesis
Mori, Hiroki
Furukawa, Kota
ACOUSTICAL SCIENCE AND TECHNOLOGY, 2025, 46 (01) : 103 - 105
[5] Probabilistic Concatenation Modeling for Corpus-Based Speech Synthesis
Sakai, Shinsuke
Kawahara, Tatsuya
Kawai, Hisashi
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2011, E94D (10) : 2006 - 2014
[6] Introduction to Multilingual Corpus-Based Concatenative Speech Synthesis
Deprez, Filip
Odijk, Jan
De Moortel, Jan
INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 357 - 360
[7] Segment Connection Networks for Corpus-Based Speech Synthesis
Coorman, Geert
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2074 - 2077
[8] Corpus-based Malay Text-to-Speech Synthesis System
Swee, Tan Tian
Salleh, Sheikh Hussain Shaikh
2008 14TH ASIA-PACIFIC CONFERENCE ON COMMUNICATIONS, (APCC), VOLS 1 AND 2, 2008, : 52 - 56
[9] Maximum Likelihood Unit Selection for Corpus-based Speech Synthesis
Gamboa Rosales, Abubeker
Rosales, Hamurabi Gamboa
Hoffmann, Ruediger
INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 748 - +
[10] A method for combining intonation modelling and speech unit selection in corpus-based speech synthesis systems
Diaz, Francisco Campillo
Rodriguez Banga, Eduardo
SPEECH COMMUNICATION, 2006, 48 (08) : 941 - 956
