A framework towards expressive speech analysis and synthesis with preliminary results

Cited by: 1
Authors
Raptis, Spyros [1]
Karabetsos, Sotiris [1, 2]
Chalamandaris, Aimilios [1]
Tsiakoulis, Pirros [1]
Affiliations
[1] Inst Language & Speech Proc, Athena Res Ctr, Voice & Sound Technol Dept, Athens 15125, Greece
[2] Technol Educ Inst TEI Athens, Dept Elect Engn, Athens 12243, Egaleo, Greece
Keywords
Emotion classification; Emotional speech; Expressive speech; Text to speech; Acoustic analysis; Speech synthesis; MODELS;
DOI
10.1007/s12193-015-0186-3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Emotion-aware computing is one of the key challenges in contemporary research on natural human interaction, in which emotional speech is an essential modality of multimodal user interfaces. The speech modality relates mainly to the recognition of emotion and affect in speech and to near-natural expressive speech synthesis, the latter being considered one of the next significant milestones in speech synthesis technology. A problem common to both recognizing and generating affective and emotional speech content is the methodology adopted for emotion analysis and modeling. This work proposes a generalized framework for annotating, analyzing and modeling expressive speech with a data-driven machine learning approach, towards building expressive text-to-speech synthesis systems. To this end, the framework and the data-driven methodology are described, comprising the techniques and approaches for acoustic analysis and expression clustering. In addition, the deployment of online experimental tools for speech perception and annotation and a description of the speech data used are given, together with initial experimental results that illustrate the potential of the proposed framework and provide encouraging indications for further research.
Pages: 387-394
Page count: 8