LIMMITS'24: Multi-Speaker, Multi-Lingual INDIC TTS With Voice Cloning

Cited by: 0

Authors:

Udupa, Sathvik [1]
Bandekar, Jesuraja [1]
Singh, Abhayjeet [1]
Deekshitha, G. [1]
Kumar, Saurabh [1]
Badiger, Sandhya [1]
Nagireddi, Amala [1]
Roopa, R. [1]
Ghosh, Prasanta Kumar [1]
Murthy, Hema A. [2]
Kumar, Pranaw [3]
Tokuda, Keiichi [4]
Hasegawa-Johnson, Mark [5]
Olbrich, Philipp [6]

Affiliations:

[1] Indian Inst Sci IISc, Elect Engn Dept, Bangalore 560012, India

[2] Indian Inst Technol, Dept Comp Sci & Engn, Chennai 600036, India

[3] CDAC, Mumbai 400049, India

[4] Nagoya Inst Technol, Dept Comp Sci, Nagoya 4668555, Japan

[5] Univ Illinois, Dept Elect & Comp Engn, Champaign, IL 61820 USA

[6] Deutsch Gesell Internatl Zusammenarbeit GIZ GmbH, D-53113 Bonn, Germany

Source:

IEEE OPEN JOURNAL OF SIGNAL PROCESSING | 2025 / Vol. 6

Keywords:

Cloning; Multilingual; Signal processing; Training; Text to speech; Noise measurement; Vocabulary; Solid modeling; Manuals; Encoding; Speech synthesis; multi-speaker; multi-lingual TTS; voice cloning; cross-lingual synthesis

DOI:

10.1109/OJSP.2025.3531782

Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes:

0808; 0809

Abstract:

The Multi-speaker, Multi-lingual Indic Text-to-Speech (TTS) with voice cloning (LIMMITS'24) challenge was organized as part of the ICASSP 2024 Signal Processing Grand Challenge. LIMMITS'24 aims at the development of voice cloning for multi-speaker, multi-lingual TTS models. Towards this, 80 hours of TTS data were released in each of Bengali, Chhattisgarhi, English (Indian), and Kannada. This is in addition to the Telugu, Hindi, and Marathi data released during the LIMMITS'23 challenge. The challenge encourages the advancement of TTS in Indian languages as well as the development of multi-speaker voice cloning techniques for TTS. The three tracks of LIMMITS'24 gave researchers and practitioners around the world an opportunity to explore the state of the art in voice cloning with TTS.

Pages: 293-302

Page count: 10
