Understanding and Mitigating the Uncertainty in Zero-Shot Translation

Cited by: 0
Authors
Wang, Wenxuan [1]
Jiao, Wenxiang [2]
Wang, Shuo [3]
Tu, Zhaopeng [2]
Lyu, Michael R. [1]
Affiliations
[1] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong 999077, Peoples R China
[2] Tencent AI Lab, Shenzhen 518057, Peoples R China
[3] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Keywords
Uncertainty; Data models; Training; Training data; Predictive models; Computational modeling; Transformers; Vocabulary; Speech processing; Neural machine translation; zero-shot translation; uncertainty
DOI
10.1109/TASLP.2024.3485555
Chinese Library Classification
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Zero-shot translation is a promising direction for building a comprehensive multilingual neural machine translation (MNMT) system. However, its quality remains unsatisfactory because of off-target issues. In this paper, we aim to understand and alleviate the off-target issues from the perspective of uncertainty in zero-shot translation. By carefully examining the translation output and model confidence, we identify two uncertainties that are responsible for the off-target issues: extrinsic data uncertainty and intrinsic model uncertainty. Based on these observations, we propose two lightweight and complementary approaches: denoising the training data before model training, and explicitly penalizing off-target translations with unlikelihood training during model training. Extensive experiments on both balanced and imbalanced datasets show that our approaches significantly improve the performance of zero-shot translation over strong MNMT baselines.
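The abstract names unlikelihood training but does not spell out the objective. As a rough illustration only, the sketch below shows the generic token-level unlikelihood term, -sum over negative candidates c of log(1 - p(c)), added to the usual negative log-likelihood. Treating tokens of the wrong target language as the negative candidates is an assumption made here for illustration; the function names, the `alpha` weight, and the toy vocabulary are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def unlikelihood_loss(logits, negative_ids, eps=1e-8):
    """Generic unlikelihood term: -sum_c log(1 - p(c)) over negative candidates.

    ASSUMPTION: negative_ids holds vocabulary indices of tokens that belong to
    the wrong (off-target) language; how such tokens are chosen is not
    specified in the abstract.
    """
    probs = softmax(logits)
    return -sum(math.log(max(1.0 - probs[i], eps)) for i in negative_ids)


def combined_loss(logits, target_id, negative_ids, alpha=1.0):
    """Negative log-likelihood of the reference token plus the weighted penalty.

    alpha is a hypothetical weighting hyperparameter.
    """
    probs = softmax(logits)
    nll = -math.log(probs[target_id])
    return nll + alpha * unlikelihood_loss(logits, negative_ids)


# Toy vocabulary of 3 tokens: index 0 is the reference token,
# index 2 is assumed to be an off-target token.
low_offtarget = combined_loss([2.0, 0.0, -1.0], target_id=0, negative_ids=[2])
high_offtarget = combined_loss([2.0, 0.0, 3.0], target_id=0, negative_ids=[2])
print(low_offtarget < high_offtarget)
```

The penalty grows as the model puts more probability mass on the negative candidates, so the second call, where the off-target token has the largest logit, yields the larger loss.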
Pages: 4894-4904
Page count: 11
Related papers
50 in total
  • [31] Zero-VIRUS*: Zero-shot VehIcle Route Understanding System for Intelligent Transportation
    Yu, Lijun
    Feng, Qianyu
    Qian, Yijun
    Liu, Wenhe
    Hauptmann, Alexander G.
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 2534 - 2543
  • [32] (ALMOST) ZERO-SHOT CROSS-LINGUAL SPOKEN LANGUAGE UNDERSTANDING
    Upadhyay, Shyam
    Faruqui, Manaal
    Tur, Gokhan
    Hakkani-Tur, Dilek
    Heck, Larry
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 6034 - 6038
  • [33] Zero-Shot Hyperspectral Sharpening
    Dian, Renwei
    Guo, Anjing
    Li, Shutao
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (10) : 12650 - 12666
  • [34] Zero-shot Adversarial Quantization
    Liu, Yuang
    Zhang, Wei
    Wang, Jun
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 1512 - 1521
  • [35] Zero-Shot Kernel Learning
    Zhang, Hongguang
    Koniusz, Piotr
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 7670 - 7679
  • [36] Zero-shot causal learning
    Nilforoshan, Hamed
    Moor, Michael
    Roohani, Yusuf
    Chen, Yining
    Surina, Anja
    Yasunaga, Michihiro
    Oblak, Sara
    Leskovec, Jure
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [37] Zero-Shot Object Counting
    Xu, Jingyi
    Le, Hieu
    Nguyen, Vu
    Ranjan, Viresh
    Samaras, Dimitris
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 15548 - 15557
  • [38] Ordinal Zero-Shot Learning
    Huo, Zengwei
    Geng, Xin
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 1916 - 1922
  • [39] Zero-shot Model Diagnosis
    Luo, Jinqi
    Wang, Zhaoning
    Wu, Chen Henry
    Huang, Dong
    De la Torre, Fernando
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 11631 - 11640
  • [40] Zero-Shot Machine Unlearning
    Chundawat, Vikram S.
    Tarun, Ayush K.
    Mandal, Murari
    Kankanhalli, Mohan
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2023, 18 : 2345 - 2354