LIMMITS'24: Multi-Speaker, Multi-Lingual INDIC TTS With Voice Cloning

Cited by: 0
Authors
Udupa, Sathvik [1]
Bandekar, Jesuraja [1]
Singh, Abhayjeet [1]
Deekshitha, G. [1]
Kumar, Saurabh [1]
Badiger, Sandhya [1]
Nagireddi, Amala [1]
Roopa, R. [1]
Ghosh, Prasanta Kumar [1]
Murthy, Hema A. [2]
Kumar, Pranaw [3]
Tokuda, Keiichi [4]
Hasegawa-Johnson, Mark [5]
Olbrich, Philipp [6]
Affiliations
[1] Indian Inst Sci IISc, Elect Engn Dept, Bangalore 560012, India
[2] Indian Inst Technol, Dept Comp Sci & Engn, Chennai 600036, India
[3] CDAC, Mumbai 400049, India
[4] Nagoya Inst Technol, Dept Comp Sci, Nagoya 4668555, Japan
[5] Univ Illinois, Dept Elect & Comp Engn, Champaign, IL 61820 USA
[6] Deutsch Gesell Internatl Zusammenarbeit GIZ GmbH, D-53113 Bonn, Germany
Keywords
Cloning; Multilingual; Signal processing; Training; Text to speech; Noise measurement; Vocabulary; Solid modeling; Manuals; Encoding; Speech synthesis; multi-speaker; multi-lingual TTS; voice cloning; cross-lingual synthesis
DOI
10.1109/OJSP.2025.3531782
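Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809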
Abstract
The Multi-speaker, Multi-lingual Indic Text-to-Speech (TTS) with voice cloning challenge (LIMMITS'24) was organized as part of the ICASSP 2024 Signal Processing Grand Challenge. LIMMITS'24 aims to develop voice cloning for multi-speaker, multi-lingual TTS models. To this end, 80 hours of TTS data were released in each of Bengali, Chhattisgarhi, English (Indian), and Kannada, in addition to the Telugu, Hindi, and Marathi data released during the LIMMITS'23 challenge. The challenge encourages the advancement of TTS in Indian languages as well as the development of multi-speaker voice cloning techniques for TTS. The three tracks of LIMMITS'24 gave researchers and practitioners around the world an opportunity to explore the state of the art in voice cloning with TTS.
Pages: 293-302
Number of pages: 10