Text-to-Speech Synthesis: Literature Review with an Emphasis on Malayalam Language

Cited by: 0

Authors:

Jasir, M. P. [1]

Balakrishnan, Kannan [1]

Affiliations:

[1] Cochin Univ Sci & Technol, Dept Comp Applicat, Artificial Intelligence Res Lab, Kochi 682022, Kerala, India

Source:

ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING | 2022, Vol. 21, No. 4

Keywords:

Text to speech synthesis; TTS literature review; Indian language TTS; Malayalam TTS; INDIAN LANGUAGES; SYNTHESIS SYSTEM; NEURAL-NETWORKS; DURATION; ENGLISH; NORMALIZATION; CONSONANTS; TRANSFORMATION; EXTRACTION; CONVERSION;

DOI:

10.1145/3501397

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Text-to-Speech Synthesis (TTS) is an active area of research that aims to generate synthetic speech from underlying text. The identified syllables are uttered with proper duration and prosody characteristics to emulate natural speech. TTS falls under the category of Natural Language Processing (NLP), which aims to bridge the communication gap between human and machine. For Western languages such as English, research on producing intelligent and natural synthetic speech has advanced considerably. But in a multilingual state like India, many regional languages, such as Malayalam, remain underexplored in NLP. In this article, we consolidate the major research works in the area of TTS in English and the prominent Indian languages, with a special emphasis on the South Indian language Malayalam. This review intends to provide the right direction for TTS research activities in the language.

Pages: 56
