Advancements in Expressive Speech Synthesis: a Review

Cited by: 0
|
Authors
Alwaisi, Shaimaa [1 ]
Nemeth, Geza [1 ]
Affiliations
[1] Budapest Univ Technol & Econ, Fac Elect Engn & Informat, Dept Telecommun & Media Informat, Budapest, Hungary
Source
INFOCOMMUNICATIONS JOURNAL | 2024 / Volume 16 / Issue 01
Keywords
Speech style; Expressivity; Emotional speech; Expressive TTS; Prosody modification; Multi-lingual and multi-speaker TTS; SPEAKER ADAPTATION; VOICE CONVERSION; TEXT; TTS; MODEL;
DOI
10.36244/ICJ.2024.1.5
Chinese Library Classification (CLC) number
TN [Electronic technology, telecommunication technology];
Discipline classification code
0809 ;
Abstract
In recent years, we have witnessed fast and widespread acceptance of speech synthesis technology, leading to a society eager to incorporate these applications into daily life. We provide a comprehensive survey of recent advancements in the field of expressive Text-To-Speech (TTS) systems. Among the different ways of representing expressivity, this paper focuses on the development of expressive TTS systems, emphasizing the methodologies employed to enhance the quality and expressiveness of synthetic speech, such as style transfer and improving speaker variability. We then outline some of the subjective and objective metrics used to evaluate the quality of synthesized speech. Finally, we turn to the realm of child speech synthesis, a domain that has been neglected for some time; research in children's speech synthesis remains wide open for exploration and development. Overall, this paper presents a comprehensive overview of historical and contemporary trends and future directions in speech synthesis research.
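The abstract does not name the specific objective metrics the survey covers; purely as an illustration, the sketch below computes mel-cepstral distortion (MCD), one widely used objective measure for comparing synthesized speech against a reference recording. The function name and the assumption of pre-aligned mel-cepstral features are ours, not the paper's.

```python
import numpy as np

def mel_cepstral_distortion(mc_ref: np.ndarray, mc_syn: np.ndarray) -> float:
    """Frame-averaged mel-cepstral distortion (MCD) in dB.

    mc_ref, mc_syn: arrays of shape (n_frames, n_coeffs) holding
    mel-cepstral coefficients of the reference and synthesized
    utterances (assumed already time-aligned, e.g. by DTW, with the
    0th energy coefficient excluded).
    """
    diff = mc_ref - mc_syn
    # Standard MCD formula: (10 / ln 10) * sqrt(2 * sum_d (c_ref_d - c_syn_d)^2)
    dist_per_frame = (10.0 / np.log(10.0)) * np.sqrt(2.0 * np.sum(diff ** 2, axis=1))
    return float(np.mean(dist_per_frame))

# Illustrative usage with random placeholder features (not real speech data).
rng = np.random.default_rng(0)
ref = rng.normal(size=(200, 24))
syn = ref + 0.05 * rng.normal(size=(200, 24))
print(f"MCD: {mel_cepstral_distortion(ref, syn):.2f} dB")
```

Lower MCD values indicate closer spectral similarity to the reference; in practice it is reported alongside subjective listening scores such as MOS.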
Pages: 35-46
Number of pages: 12
Related Papers
50 records in total
  • [31] Expressive Speech Animation Synthesis with Phoneme-Level Controls
    Deng, Z.
    Neumann, U.
    COMPUTER GRAPHICS FORUM, 2008, 27 (08) : 2096 - 2113
  • [32] A framework towards expressive speech analysis and synthesis with preliminary results
    Raptis, Spyros
    Karabetsos, Sotiris
    Chalamandaris, Aimilios
    Tsiakoulis, Pirros
    JOURNAL ON MULTIMODAL USER INTERFACES, 2015, 9 (04) : 387 - 394
  • [33] Spoken Dialogue System for Call Centers with Expressive Speech Synthesis
    Nicmanis, Davis
    Salimbajevs, Askars
    INTERSPEECH 2022, 2022, : 5215 - 5218
  • [34] Limited domain synthesis of expressive military speech for animated characters
    Johnson, WL
    Narayanan, S
    Whitney, R
    Das, R
    Bulut, M
    LaBore, C
    PROCEEDINGS OF THE 2002 IEEE WORKSHOP ON SPEECH SYNTHESIS, 2002, : 163 - 166
  • [35] Expressive Speech Synthesis using Prosodic Modification for Marathi Language
    Anil, Manjare Chandraprabha
    Shirbahadurkar, S. D.
    2ND INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND INTEGRATED NETWORKS (SPIN) 2015, 2015, : 126 - 130
  • [36] Pitch Contour Modelling and Modification for Expressive Marathi Speech Synthesis
    Deo, Rohit S.
    Deshpande, Pallavi S.
    2014 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2014, : 2455 - 2458
  • [37] Contribution to the Design of an Expressive Speech Synthesis System for the Arabic Language
    Demri, Lyes
    Falek, Leila
    Teffahi, Hocine
    SPEECH AND COMPUTER (SPECOM 2015), 2015, 9319 : 178 - 185
  • [38] Can We Generate Emotional Pronunciations for Expressive Speech Synthesis?
    Tahon, Marie
    Lecorve, Gwenole
    Lolive, Damien
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2020, 11 (04) : 684 - 695
  • [39] EMPHATIC SPEECH GENERATION WITH CONDITIONED INPUT LAYER AND BIDIRECTIONAL LSTMS FOR EXPRESSIVE SPEECH SYNTHESIS
    Li, Runnan
    Wu, Zhiyong
    Huang, Yuchen
    Jia, Jia
    Meng, Helen
    Cai, Lianhong
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5129 - 5133
  • [40] Expressive Speech Synthesis via Modeling Expressions with Variational Autoencoder
    Akuzawa, Kei
    Iwasawa, Yusuke
    Matsuo, Yutaka
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3067 - 3071