Zero-Shot Cross-Lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders

被引:0
|
作者
Chen, Guanhua [1 ]
Ma, Shuming [2 ]
Chen, Yun [3 ]
Dong, Li [2 ]
Zhang, Dongdong [2 ]
Pan, Jia [1 ]
Wang, Wenping [1 ,4 ]
Wei, Furu [2 ]
机构
[1] Univ Hong Kong, Hong Kong, Peoples R China
[2] Microsoft Res, Redmond, WA USA
[3] Shanghai Univ Finance & Econ, Shanghai, Peoples R China
[4] Texas A&M Univ, College Stn, TX 77843 USA
基金
中国国家自然科学基金;
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Previous work mainly focuses on improving cross-lingual transfer for NLU tasks with a multilingual pretrained encoder (MPE), or improving the performance on supervised machine translation with BERT. However, it is under-explored that whether the MPE can help to facilitate the cross-lingual transferability of NMT model. In this paper, we focus on a zero-shot cross-lingual transfer task in NMT. In this task, the NMT model is trained with parallel dataset of only one language pair and an off-the-shelf MPE, then it is directly tested on zero-shot language pairs. We propose SixT, a simple yet effective model for this task. SixT leverages the MPE with a two-stage training schedule and gets further improvement with a position disentangled encoder and a capacity-enhanced decoder. Using this method, SixT significantly outperforms mBART, a pretrained multilingual encoder-decoder model explicitly designed for NMT, with an average improvement of 7.1 BLEU on zero-shot any-to-English test sets across 14 source languages. Furthermore, with much less training computation cost and training data, our model achieves better performance on 15 any-to-English test sets than CRISS and m2m-100, two strong multilingual NMT baselines.
引用
收藏
页码:15 / 26
页数:12
相关论文
共 50 条
  • [41] Learn and Consolidate: Continual Adaptation for Zero-Shot and Multilingual Neural Machine Translation
    Huang, Kaiyu
    Li, Peng
    Liu, Junpeng
    Sung, Maosong
    Liu, Yang
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 13938 - 13951
  • [42] On the Off-Target Problem of Zero-Shot Multilingual Neural Machine Translation
    Chen, Liang
    Ma, Shuming
    Zhang, Dongdong
    Wei, Furu
    Chang, Baobao
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 9542 - 9558
  • [43] (ALMOST) ZERO-SHOT CROSS-LINGUAL SPOKEN LANGUAGE UNDERSTANDING
    Upadhyay, Shyam
    Faruqui, Manaal
    Tur, Gokhan
    Hakkani-Tur, Dilek
    Heck, Larry
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 6034 - 6038
  • [44] Cross-lingual Contextualized Topic Models with Zero-shot Learning
    Bianchi, Federico
    Terragni, Silvia
    Hovy, Dirk
    Nozza, Debora
    Fersini, Elisabetta
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1676 - 1683
  • [45] Simple and Effective Zero-shot Cross-lingual Phoneme Recognition
    Xu, Qiantong
    Baevski, Alexei
    Auli, Michael
    INTERSPEECH 2022, 2022, : 2113 - 2117
  • [46] Cross-Lingual BERT Transformation for Zero-Shot Dependency Parsing
    Wang, Yuxuan
    Che, Wanxiang
    Guo, Jiang
    Liu, Yijia
    Liu, Ting
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 5721 - 5727
  • [47] Zero-Shot Learning for Cross-Lingual News Sentiment Classification
    Pelicon, Andraz
    Pranjic, Marko
    Miljkovic, Dragana
    Skrlj, Blaz
    Pollak, Senja
    APPLIED SCIENCES-BASEL, 2020, 10 (17):
  • [48] Soft Layer Selection with Meta-Learning for Zero-Shot Cross-Lingual Transfer
    Xu, Weijia
    Haider, Batool
    Krone, Jason
    Mansour, Saab
    1ST WORKSHOP ON META LEARNING AND ITS APPLICATIONS TO NATURAL LANGUAGE PROCESSING (METANLP 2021), 2021, : 11 - 18
  • [49] BAD-X: Bilingual Adapters Improve Zero-Shot Cross-Lingual Transfer
    Parovic, Marinela
    Glavas, Goran
    Vulic, Ivan
    Korhonen, Anna
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 1791 - 1799
  • [50] An Empirical Investigation of Word Alignment Supervision for Zero-Shot Multilingual Neural Machine Translation
    Raganato, Alessandro
    Vazquez, Raul
    Creutz, Mathias
    Tiedemann, Jorg
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 8449 - 8456