Understanding and Mitigating the Uncertainty in Zero-Shot Translation

Authors
Wang, Wenxuan [1]
Jiao, Wenxiang [2]
Wang, Shuo [3]
Tu, Zhaopeng [2]
Lyu, Michael R. [1]
Affiliations
[1] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong 999077, Peoples R China
[2] Tencent AI Lab, Shenzhen 518057, Peoples R China
[3] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Keywords
Uncertainty; Data models; Training; Training data; Predictive models; Computational modeling; Transformers; Vocabulary; Speech processing; Neural machine translation; zero-shot translation; uncertainty
DOI
10.1109/TASLP.2024.3485555
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Zero-shot translation is a promising direction for building a comprehensive multilingual neural machine translation (MNMT) system. However, its quality is still not satisfactory due to off-target issues. In this paper, we aim to understand and alleviate the off-target issues from the perspective of uncertainty in zero-shot translation. By carefully examining the translation output and model confidence, we identify two uncertainties that are responsible for the off-target issues, namely, extrinsic data uncertainty and intrinsic model uncertainty. Based on these observations, we propose two lightweight and complementary approaches: denoising the training data, and explicitly penalizing off-target translations through unlikelihood training. Extensive experiments on both balanced and imbalanced datasets show that our approaches significantly improve the performance of zero-shot translation over strong MNMT baselines.
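The abstract names unlikelihood training as the way off-target translations are penalized. As a rough illustration only, the generic unlikelihood term adds -log(1 - p(token)) for each token the model should avoid. The sketch below is not the authors' implementation: the function name, the input layout, and the idea of passing in token ids that belong to the wrong target language are all assumptions made for this example.

```python
import math


def unlikelihood_loss(token_probs, negative_token_ids, eps=1e-8):
    """Generic unlikelihood term: mean of -log(1 - p(token)).

    token_probs: list of per-position probability lists (hypothetical layout).
    negative_token_ids: ids of tokens to discourage, e.g. tokens assumed
        to belong to an off-target language (an assumption of this sketch).
    """
    total, count = 0.0, 0
    for probs in token_probs:  # one distribution per target position
        for tok in negative_token_ids:
            # Clamp to eps so log() stays finite when p(token) is close to 1.
            total += -math.log(max(1.0 - probs[tok], eps))
            count += 1
    return total / count if count else 0.0


# The penalty grows as the model assigns more mass to the discouraged token.
low = unlikelihood_loss([[0.9, 0.1]], [1])
high = unlikelihood_loss([[0.1, 0.9]], [1])
```

In practice a term like this would be combined with the usual likelihood loss on the reference translation; how the two are weighted is not stated in the abstract.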
Pages: 4894-4904
Page count: 11