Advancements in Expressive Speech Synthesis: a Review

Cited by: 0

Authors:

Alwaisi, Shaimaa [1]

Nemeth, Geza [1]

Affiliations:

[1] Budapest Univ Technol & Econ, Fac Elect Engn & Informat, Dept Telecommun & Media Informat, Budapest, Hungary

Source:

INFOCOMMUNICATIONS JOURNAL | 2024 / Vol. 16 / No. 01

Keywords:

Speech style; Expressivity; Emotional speech; Expressive TTS; Prosody modification; Multi-lingual and multi-speaker TTS; SPEAKER ADAPTATION; VOICE CONVERSION; TEXT; TTS; MODEL

DOI:

10.36244/ICJ.2024.1.5

Chinese Library Classification:

TN [Electronic technology, communication technology]

Subject classification code:

0809

Abstract:

In recent years, we have witnessed a fast and widespread acceptance of speech synthesis technology, leading to a transition toward a society characterized by a strong desire to incorporate these applications into daily life. We provide a comprehensive survey of recent advancements in the field of expressive Text-To-Speech (TTS) systems. Among the different methods of representing expressivity, this paper focuses on the development of expressive TTS systems, emphasizing the methodologies employed to enhance the quality and expressiveness of synthetic speech, such as style transfer and improving speaker variability. After that, we point out some of the subjective and objective metrics that are used to evaluate the quality of synthesized speech. Finally, we turn to child speech synthesis, a domain that has been neglected for some time. This underscores that research in children's speech synthesis is still wide open for exploration and development. Overall, this paper presents a comprehensive overview of historical and contemporary trends and future directions in speech synthesis research.

Pages: 35-46

Number of pages: 12

50 records in total

[21] Expressive Prosody for Unit-selection Speech Synthesis
Strom, Volker
Clark, Robert
King, Simon
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006: 1296-1299
[22] JOINT AND ADVERSARIAL TRAINING WITH ASR FOR EXPRESSIVE SPEECH SYNTHESIS
Zhang, Kaili
Gong, Cheng
Lu, Wenhuan
Wang, Longbiao
Wei, Jianguo
Liu, Dawei
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022: 6322-6326
[23] SynPaFlex-Corpus: An Expressive French Audiobooks Corpus Dedicated to Expressive Speech Synthesis
Sini, Aghilas
Lolive, Damien
Vidal, Gaelle
Tahon, Marie
Delais-Roussarie, Elisabeth
PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018: 4289-4296
[24] Deep learning-based expressive speech synthesis: a systematic review of approaches, challenges, and resources
Barakat, Huda
Turk, Oytun
Demiroglu, Cenk
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2024, 2024 (01)
[25] Expressive Speech Synthesis: Past, Present, and Possible Futures
Schroeder, Marc
AFFECTIVE INFORMATION PROCESSING, 2009: 111-126
[26] Speech Modification for Prosody Conversion in Expressive Marathi Text-to-Speech Synthesis
Anil, Manjare Chandraprabha
Shirbahadurkar, S. D.
2014 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND INTEGRATED NETWORKS (SPIN), 2014: 56-58
[27] Modeling the Acoustic Correlates of Expressive Elements in Text Genres for Expressive Text-to-Speech Synthesis
Yang, Hongwu
Meng, Helen M.
Cai, Lianhong
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006: 1806-1809
[28] What type of inputs will we need for expressive speech synthesis?
Campbell, N
PROCEEDINGS OF THE 2002 IEEE WORKSHOP ON SPEECH SYNTHESIS, 2002: 95-98
[29] Rigid head motion in expressive speech animation: Analysis and synthesis
Busso, Carlos
Deng, Zhigang
Grimm, Michael
Neumann, Ulrich
Narayanan, Shrikanth
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (03): 1075-1086
[30] Combining Manual and Automatic Prosodic Annotation for Expressive Speech Synthesis
Brognaux, Sandrine
Francois, Thomas
Saerens, Marco
LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2016: 3872-3879