Developments in corpus-based speech synthesis: Approaching natural conversational speech

Cited by: 19
Author
Campbell, N [1]
Affiliation
[1] ATR Network Informat Labs, Dept Emergent Commun, Kyoto 6190288, Japan
Source
Keywords
speech synthesis; corpora; concatenation; paralinguistic information; communication; affect;
DOI
10.1093/ietisy/e88-d.3.376
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper describes the special demands of conversational speech in the context of corpus-based speech synthesis. The author proposed the CHATR system of prosody-based unit selection for concatenative waveform synthesis seven years ago, and now extends this work to incorporate the results of an analysis of five years of recordings of spontaneous conversational speech in a wide range of actual daily-life situations. The paper proposes that the expression of affect (often translated as 'kansei' in Japanese) is the main factor differentiating laboratory speech from real-world conversational speech, and presents a framework for the specification of affect through differences in speaking style and voice quality. Having an enormous corpus of speech samples available for concatenation allows the selection of complete phrase-sized utterance segments, and shifts the focus of unit selection from segmental or phonetic continuity to prosodic and discoursal appropriateness. Samples of the resulting large-corpus-based synthesis can be heard at http://feast.his.atr.jp/AESOP.
Pages: 376-383
Number of pages: 8