VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation

Cited by: 0
Authors
Luo, Fuli [1 ]
Wang, Wei [1 ]
Liu, Jiahao [1 ]
Liu, Yijia [1 ]
Bi, Bin [1 ]
Huang, Songfang [1 ]
Huang, Fei [1 ]
Si, Luo [1 ]
Affiliations
[1] Alibaba Group, Hangzhou, People's Republic of China
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Existing work on multilingual pre-training has demonstrated the potential of cross-lingual transferability by training a unified Transformer encoder for multiple languages. However, much of this work relies only on a shared vocabulary and bilingual contexts to encourage correlation across languages, which provides only a loose and implicit alignment of the contextual representations between languages. In this paper, we plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages. It effectively avoids the degeneration of predicting masked words conditioned only on the context in their own language. More importantly, when fine-tuning on downstream tasks, the cross-attention module can be plugged in or out on demand, thus naturally benefiting a wider range of cross-lingual tasks, from language understanding to generation. As a result, the proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark, covering text classification, sequence labeling, question answering, and sentence retrieval. For cross-lingual generation tasks, it also outperforms all existing cross-lingual models and state-of-the-art Transformer variants on the WMT14 English-to-German and English-to-French translation datasets, with gains of up to 1~2 BLEU.
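As a sketch of the plug-in/plug-out design described in the abstract, the following is a minimal, hypothetical PyTorch rendering of a Transformer encoder layer whose cross-attention sub-layer is used only when states from the paired language are supplied. The class name PluggableEncoderLayer, the argument cross_states, and all hyperparameters are illustrative assumptions, not the authors' released implementation.

# Illustrative sketch only (assumed names, not the paper's code): a Transformer
# encoder layer with an optional cross-attention sub-layer. Passing states of a
# parallel sentence "plugs in" cross-attention; omitting them "plugs it out".
import torch
import torch.nn as nn

class PluggableEncoderLayer(nn.Module):
    def __init__(self, d_model=768, n_heads=12, d_ff=3072, dropout=0.1):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
        self.norm_self = nn.LayerNorm(d_model)
        self.norm_cross = nn.LayerNorm(d_model)
        self.norm_ffn = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, cross_states=None):
        # Self-attention over the sentence in its own language.
        h, _ = self.self_attn(x, x, x)
        x = self.norm_self(x + self.dropout(h))
        # Cross-attention over the paired language, used only when provided, so
        # masked words are not predicted solely from their own-language context.
        if cross_states is not None:
            h, _ = self.cross_attn(x, cross_states, cross_states)
            x = self.norm_cross(x + self.dropout(h))
        # Position-wise feed-forward sub-layer.
        h = self.ffn(x)
        return self.norm_ffn(x + self.dropout(h))

# Usage: understanding-style fine-tuning omits cross_states (module plugged out),
# while pre-training or generation-style fine-tuning passes the other language's states.
layer = PluggableEncoderLayer()
src = torch.randn(2, 16, 768)   # e.g. English token states
par = torch.randn(2, 20, 768)   # e.g. German token states from a parallel sentence
out_mono = layer(src)                         # cross-attention plugged out
out_cross = layer(src, cross_states=par)      # cross-attention plugged in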
Pages: 3980-3994
Number of pages: 15
Related papers (showing 10 of 50)
  • [1] Cross-Lingual Natural Language Generation via Pre-Training. Chi, Zewen; Dong, Li; Wei, Furu; Wang, Wenhui; Mao, Xian-Ling; Huang, Heyan. Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020), 34: 7570-7577.
  • [2] Alternating Language Modeling for Cross-Lingual Pre-Training. Yang, Jian; Ma, Shuming; Zhang, Dongdong; Wu, Shuangzhi; Li, Zhoujun; Zhou, Ming. Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020), 34: 9386-9393.
  • [3] XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation. Liang, Yaobo; Duan, Nan; Gong, Yeyun; Wu, Ning; Guo, Fenfei; Qi, Weizhen; Gong, Ming; Shou, Linjun; Jiang, Daxin; Cao, Guihong; Fan, Xiaodong; Zhang, Ruofei; Agrawal, Rahul; Cui, Edward; Wei, Sining; Bharti, Taroon; Qiao, Ying; Chen, Jiun-Hung; Wu, Winnie; Liu, Shuguang; Yang, Fan; Campos, Daniel; Majumder, Rangan; Zhou, Ming. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020): 6008-6018.
  • [4] Mixed-Lingual Pre-training for Cross-lingual Summarization. Xu, Ruochen; Zhu, Chenguang; Shi, Yu; Zeng, Michael; Huang, Xuedong. 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (AACL-IJCNLP 2020): 536-541.
  • [5] Unicoder: A Universal Language Encoder by Pre-training with Multiple Cross-lingual Tasks. Huang, Haoyang; Liang, Yaobo; Duan, Nan; Gong, Ming; Shou, Linjun; Jiang, Daxin; Zhou, Ming. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019): 2485-2494.
  • [6] Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training. Zheng, Bo; Dong, Li; Huang, Shaohan; Singhal, Saksham; Che, Wanxiang; Liu, Ting; Song, Xia; Wei, Furu. 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021): 3203-3215.
  • [7] XLM-E: Cross-lingual Language Model Pre-training via ELECTRA. Chi, Zewen; Huang, Shaohan; Dong, Li; Ma, Shuming; Zheng, Bo; Singhal, Saksham; Bajaj, Payal; Song, Xia; Mao, Xian-Ling; Huang, Heyan; Wei, Furu. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Vol. 1 (Long Papers): 6170-6182.
  • [8] INFOXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training. Chi, Zewen; Dong, Li; Wei, Furu; Yang, Nan; Singhal, Saksham; Wang, Wenhui; Song, Xia; Mao, Xian-Ling; Huang, Heyan; Zhou, Ming. 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2021): 3576-3588.
  • [9] On-the-fly Cross-lingual Masking for Multilingual Pre-training. Ai, Xi; Fang, Bin. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023), Vol. 1: 855-876.
  • [10] Multi-Granularity Contrasting for Cross-Lingual Pre-Training. Li, Shicheng; Yang, Pengcheng; Luo, Fuli; Xie, Jun. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021: 1708-1717.