LIMMITS'24: Multi-Speaker, Multi-Lingual INDIC TTS With Voice Cloning

Cited by: 0
Authors
Udupa, Sathvik [1]
Bandekar, Jesuraja [1]
Singh, Abhayjeet [1]
Deekshitha, G. [1]
Kumar, Saurabh [1]
Badiger, Sandhya [1]
Nagireddi, Amala [1]
Roopa, R. [1]
Ghosh, Prasanta Kumar [1]
Murthy, Hema A. [2]
Kumar, Pranaw [3]
Tokuda, Keiichi [4]
Hasegawa-Johnson, Mark [5]
Olbrich, Philipp [6]
Affiliations
[1] Indian Inst Sci IISc, Elect Engn Dept, Bangalore 560012, India
[2] Indian Inst Technol, Dept Comp Sci & Engn, Chennai 600036, India
[3] CDAC, Mumbai 400049, India
[4] Nagoya Inst Technol, Dept Comp Sci, Nagoya 4668555, Japan
[5] Univ Illinois, Dept Elect & Comp Engn, Champaign, IL 61820 USA
[6] Deutsch Gesell Internatl Zusammenarbeit GIZ GmbH, D-53113 Bonn, Germany
Keywords
Cloning; Multilingual; Signal processing; Training; Text to speech; Noise measurement; Vocabulary; Solid modeling; Manuals; Encoding; Speech synthesis; multi-speaker; multi-lingual TTS; voice cloning; cross-lingual synthesis
DOI
10.1109/OJSP.2025.3531782
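Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract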
The Multi-speaker, Multi-lingual Indic Text to Speech (TTS) with voice cloning (LIMMITS'24) challenge is organized as part of the ICASSP 2024 signal processing grand challenge. LIMMITS'24 aims at the development of voice cloning for the multi-speaker, multi-lingual Text-to-Speech (TTS) model. Towards this, 80 hours of TTS data has been released in each of Bengali, Chhattisgarhi, English (Indian), and Kannada languages. This is in addition to Telugu, Hindi, and Marathi data released during the LIMMITS'23 challenge. The challenge encourages the advancement of TTS in Indian Languages as well as the development of multi-speaker voice cloning techniques for TTS. The three tracks of LIMMITS'24 have provided an opportunity for various researchers and practitioners around the world to explore the state of the art in research for voice cloning with TTS.
Pages: 293-302
Page count: 10