Expressive Machine Dubbing Through Phrase-level Cross-lingual Prosody Transfer

Authors
Swiatkowski, Jakub [1]
Wang, Duo [1]
Babianski, Mikolaj [1]
Coccia, Giuseppe [1]
Tobing, Patrick Lumban [1]
Vipperla, Ravichander [1]
Klimkov, Viacheslav [1]
Pollet, Vincent [1]
Affiliation
[1] Amazon Sci, Seattle, WA 98109 USA
Source
Interspeech 2023
Keywords
speech synthesis; cross-lingual; prosody transfer; multi-lingual; end-to-end; machine dubbing;
DOI
10.21437/Interspeech.2023-441
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
Speech generation for machine dubbing adds complexity to conventional Text-To-Speech solutions as the generated output is required to match the expressiveness, emotion and speaking rate of the source content. Capturing and transferring details and variations in prosody is a challenge. We introduce phrase-level cross-lingual prosody transfer for expressive multi-lingual machine dubbing. The proposed phrase-level prosody transfer delivers a significant 6.2% MUSHRA score increase over a baseline with utterance-level global prosody transfer, thereby closing the gap between the baseline and expressive human dubbing by 23.2%, while preserving intelligibility of the synthesised speech.
Pages: 5546-5550
Number of pages: 5