VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation

Cited: 0
Authors
Luo, Fuli [1 ]
Wang, Wei [1 ]
Liu, Jiahao [1 ]
Liu, Yijia [1 ]
Bi, Bin [1 ]
Huang, Songfang [1 ]
Huang, Fei [1 ]
Si, Luo [1 ]
Affiliations
[1] Alibaba Group, Hangzhou, People's Republic of China
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing work in multilingual pretraining has demonstrated the potential of cross-lingual transferability by training a unified Transformer encoder for multiple languages. However, much of this work relies only on a shared vocabulary and bilingual contexts to encourage correlation across languages, which is a loose and implicit way of aligning contextual representations between languages. In this paper, we plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages. It effectively avoids the degeneration of predicting masked words conditioned only on the context of their own language. More importantly, when fine-tuning on downstream tasks, the cross-attention module can be plugged in or out on demand, thus naturally benefiting a wider range of cross-lingual tasks, from language understanding to generation. As a result, the proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark, covering text classification, sequence labeling, question answering, and sentence retrieval. For cross-lingual generation tasks, it also outperforms all existing cross-lingual models and state-of-the-art Transformer variants on the WMT14 English-to-German and English-to-French translation datasets, with gains of up to 1~2 BLEU.
Pages: 3980-3994
Page count: 15
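To make the abstract's central idea concrete, the following is a minimal PyTorch sketch of a Transformer encoder layer with a pluggable cross-attention sub-layer: when hidden states from a parallel sentence in another language are supplied, predictions can draw on the bilingual context; when they are omitted, the layer reduces to a plain encoder layer. All class and argument names (e.g. PluggableCrossLingualLayer, cross_states) are illustrative assumptions, not VECO's actual implementation.

```python
# Hypothetical sketch of the abstract's idea: a Transformer encoder layer
# whose cross-attention sub-layer can be "plugged in or out" on demand.
# Names and hyperparameters are assumptions, not VECO's released code.
import torch
import torch.nn as nn


class PluggableCrossLingualLayer(nn.Module):
    def __init__(self, d_model: int = 768, n_heads: int = 12, d_ff: int = 3072):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Cross-attention over the other language's contextual representations.
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, x, cross_states=None):
        # Standard self-attention over the current language.
        h, _ = self.self_attn(x, x, x)
        x = self.norm1(x + h)
        # Optional cross-attention: with the other language's hidden states,
        # masked words need not be predicted from their own language alone;
        # with cross_states=None, this is an ordinary encoder layer.
        if cross_states is not None:
            h, _ = self.cross_attn(x, cross_states, cross_states)
            x = self.norm2(x + h)
        return self.norm3(x + self.ffn(x))


# Usage: pretrain with the module plugged in; fine-tune with it in or out.
layer = PluggableCrossLingualLayer()
src = torch.randn(2, 16, 768)   # e.g. English token representations
tgt = torch.randn(2, 20, 768)   # e.g. German token representations
out_bilingual = layer(src, cross_states=tgt)  # cross-attention plugged in
out_mono = layer(src)                         # cross-attention plugged out
```

Because the cross-attention path is an additive residual branch, removing it leaves a standard encoder for understanding tasks, while keeping it yields an encoder-decoder-like structure suitable for generation, which is the flexibility the abstract claims.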
Related Papers
50 in total
  • [41] XLPT-AMR: Cross-Lingual Pre-Training via Multi-Task Learning for Zero-Shot AMR Parsing and Text Generation
    Xu, Dongqin
    Li, Junhui
    Zhu, Muhua
    Zhang, Min
    Zhou, Guodong
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021, : 896 - 907
  • [42] (Almost) Zero-Shot Cross-Lingual Spoken Language Understanding
    Upadhyay, Shyam
    Faruqui, Manaal
    Tur, Gokhan
    Hakkani-Tur, Dilek
    Heck, Larry
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 6034 - 6038
  • [43] Understanding Translationese in Cross-Lingual Summarization
    Wang, Jiaan
    Meng, Fandong
    Liang, Yunlong
    Zhang, Tingyi
    Xu, Jiarong
    Li, Zhixu
    Zhou, Jie
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 3837 - 3849
  • [44] Cross-lingual Language Model Pretraining
    Conneau, Alexis
    Lample, Guillaume
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [45] Language Agnostic Speaker Embedding for Cross-Lingual Personalized Speech Generation
    Zhou, Yi
    Tian, Xiaohai
    Li, Haizhou
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 3427 - 3439
  • [46] BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
    Li, Junnan
    Li, Dongxu
    Xiong, Caiming
    Hoi, Steven
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [47] MLUG: Bootstrapping Language-Motion Pre-Training for Unified Motion-Language Understanding and Generation
    Luo, Hongliang
    Xi, Wei
    Tang, Daniel
    SENSORS, 2024, 24 (22)
  • [48] Cross-Lingual Image Caption Generation
    Miyazaki, Takashi
    Shimizu, Nobuyuki
    PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 1780 - 1790
  • [49] Self-training Improves Pre-training for Natural Language Understanding
    Du, Jingfei
    Grave, Edouard
    Gunel, Beliz
    Chaudhary, Vishrav
    Celebi, Onur
    Auli, Michael
    Stoyanov, Veselin
    Conneau, Alexis
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 5408 - 5418
  • [50] Fusion or Defusion? Flexible Vision-and-Language Pre-Training
    Sun, Rongyi
    Li, Ziran
    Ding, Yifeng
    Wang, Qifan
    Wang, Jingang
    Zheng, Hai-Tao
    Wu, Wei
    Xian, Yunsen
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 5105 - 5119