Developments in corpus-based speech synthesis: Approaching natural conversational speech

Cited by: 19
Author
Campbell, N [1]
Affiliation
[1] ATR Network Informat Labs, Dept Emergent Commun, Kyoto 6190288, Japan
Keywords
speech synthesis; corpora; concatenation; paralinguistic information; communication; affect;
DOI
10.1093/ietisy/e88-d.3.376
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper describes the special demands of conversational speech in the context of corpus-based speech synthesis. The author proposed the CHATR system of prosody-based unit selection for concatenative waveform synthesis seven years ago, and now extends this work to incorporate the results of an analysis of five years of recordings of spontaneous conversational speech in a wide range of actual daily-life situations. The paper proposes that the expression of affect (often translated as 'kansei' in Japanese) is the main factor differentiating laboratory speech from real-world conversational speech, and presents a framework for the specification of affect through differences in speaking style and voice quality. Having an enormous corpus of speech samples available for concatenation allows the selection of complete phrase-sized utterance segments, and shifts the focus of unit selection from segmental or phonetic continuity to prosodic and discoursal appropriateness. Samples of the resulting large-corpus-based synthesis can be heard at http://feast.his.atr.jp/AESOP.
Pages: 376-383
Page count: 8