Zero-Shot Cross-Lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders

被引:0
|
作者
Chen, Guanhua [1 ]
Ma, Shuming [2 ]
Chen, Yun [3 ]
Dong, Li [2 ]
Zhang, Dongdong [2 ]
Pan, Jia [1 ]
Wang, Wenping [1 ,4 ]
Wei, Furu [2 ]
机构
[1] Univ Hong Kong, Hong Kong, Peoples R China
[2] Microsoft Res, Redmond, WA USA
[3] Shanghai Univ Finance & Econ, Shanghai, Peoples R China
[4] Texas A&M Univ, College Stn, TX 77843 USA
基金
中国国家自然科学基金;
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Previous work mainly focuses on improving cross-lingual transfer for NLU tasks with a multilingual pretrained encoder (MPE), or improving the performance on supervised machine translation with BERT. However, it is under-explored that whether the MPE can help to facilitate the cross-lingual transferability of NMT model. In this paper, we focus on a zero-shot cross-lingual transfer task in NMT. In this task, the NMT model is trained with parallel dataset of only one language pair and an off-the-shelf MPE, then it is directly tested on zero-shot language pairs. We propose SixT, a simple yet effective model for this task. SixT leverages the MPE with a two-stage training schedule and gets further improvement with a position disentangled encoder and a capacity-enhanced decoder. Using this method, SixT significantly outperforms mBART, a pretrained multilingual encoder-decoder model explicitly designed for NMT, with an average improvement of 7.1 BLEU on zero-shot any-to-English test sets across 14 source languages. Furthermore, with much less training computation cost and training data, our model achieves better performance on 15 any-to-English test sets than CRISS and m2m-100, two strong multilingual NMT baselines.
引用
收藏
页码:15 / 26
页数:12
相关论文
共 50 条
  • [21] Zero-shot Cross-lingual Transfer is Under-specified Optimization
    Wu, Shijie
    Van Durme, Benjamin
    Dredze, Mark
    PROCEEDINGS OF THE 7TH WORKSHOP ON REPRESENTATION LEARNING FOR NLP, 2022, : 236 - 248
  • [22] Cross-Lingual Transfer in Zero-Shot Cross-Language Entity Linking
    Schumacher, Elliot
    Mayfield, James
    Dredze, Mark
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 583 - 595
  • [23] Realistic Zero-Shot Cross-Lingual Transfer in Legal Topic Classification
    Xenouleas, Stratos
    Tsoukara, Alexia
    Panagiotakis, Giannis
    Chalkidis, Ilias
    Androutsopoulos, Ion
    PROCEEDINGS OF THE 12TH HELLENIC CONFERENCE ON ARTIFICIAL INTELLIGENCE, SETN 2022, 2022,
  • [24] Multilingual Generative Language Models for Zero-Shot Cross-Lingual Event Argument Extraction
    Huang, Kuan-Hao
    Hsu, I-Hung
    Natarajan, Premkumar
    Chang, Kai-Wei
    Peng, Nanyun
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 4633 - 4646
  • [25] Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models
    Huang, Po-Yao
    Patrick, Mandela
    Hu, Junjie
    Neubig, Graham
    Metze, Florian
    Hauptmann, Alexander
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 2443 - 2459
  • [26] Zero-Shot Cross-Lingual Opinion Target Extraction
    Jebbara, Soufian
    Cimiano, Philipp
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 2486 - 2495
  • [27] Evaluating the Cross-Lingual Effectiveness of Massively Multilingual Neural Machine Translation
    Siddhant, Aditya
    Johnson, Melvin
    Tsai, Henry
    Ari, Naveen
    Riesa, Jason
    Bapna, Ankur
    Firat, Orhan
    Raman, Karthik
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 8854 - 8861
  • [28] XeroAlign: Zero-Shot Cross-lingual Transformer Alignment
    Gritta, Milan
    Iacobacci, Ignacio
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 371 - 381
  • [29] Zero-shot Cross-lingual Alignment for Embedding Initialization
    Ai, Xi
    Huang, Zhiyong
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 5997 - 6007
  • [30] Zero-Shot Cross-Lingual Transfer in Legal Domain Using Transformer Models
    Shaheen, Zein
    Wohlgenannt, Gerhard
    Mouromtsev, Dmitry
    2021 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI 2021), 2021, : 450 - 456