50 entries in total
- [41] Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation. INTERSPEECH 2020, 2020: 3191-3195
- [42] XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model. INTERSPEECH 2024, 2024: 4978-4982
- [43] Exact Prosody Cloning in Zero-Shot Multispeaker Text-to-Speech. 2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022: 962-969
- [44] MnTTS2: An Open-Source Multi-speaker Mongolian Text-to-Speech Synthesis Dataset. MAN-MACHINE SPEECH COMMUNICATION, NCMMSC 2022, 2023, 1765: 318-329
- [46] Generalizable Zero-Shot Speaker Adaptive Speech Synthesis with Disentangled Representations. INTERSPEECH 2023, 2023: 4454-4458
- [47] Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration. INTERSPEECH 2021, 2021: 3600-3604
- [48] Scaling NVIDIA's Multi-Speaker Multi-Lingual TTS Systems with Zero-Shot TTS to Indic Languages. 2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW 2024, 2024: 115-116
- [49] Speaker Specific Phrase Break Modeling with Conditional Random Fields for Text-to-Speech. 2016 PATTERN RECOGNITION ASSOCIATION OF SOUTH AFRICA AND ROBOTICS AND MECHATRONICS INTERNATIONAL CONFERENCE (PRASA-ROBMECH), 2016
- [50] Zero-Shot Voice Cloning Text-to-Speech for Dysphonia Disorder Speakers. IEEE ACCESS, 2024, 12: 63528-63547