VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation

Cited: 0
Authors
Luo, Fuli [1 ]
Wang, Wei [1 ]
Liu, Jiahao [1 ]
Liu, Yijia [1 ]
Bi, Bin [1 ]
Huang, Songfang [1 ]
Huang, Fei [1 ]
Si, Luo [1 ]
Affiliations
[1] Alibaba Group, Hangzhou, People's Republic of China
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing work in multilingual pretraining has demonstrated the potential of cross-lingual transferability by training a unified Transformer encoder for multiple languages. However, much of this work relies only on a shared vocabulary and bilingual contexts to encourage correlation across languages, which is a loose and implicit way of aligning contextual representations between languages. In this paper, we plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages. It effectively avoids the degeneration of predicting masked words conditioned only on the context of their own language. More importantly, when fine-tuning on downstream tasks, the cross-attention module can be plugged in or out on demand, thus naturally benefiting a wider range of cross-lingual tasks, from language understanding to generation. As a result, the proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark, covering text classification, sequence labeling, question answering, and sentence retrieval. For cross-lingual generation tasks, it also outperforms all existing cross-lingual models and state-of-the-art Transformer variants on the WMT14 English-to-German and English-to-French translation datasets, with gains of up to 1~2 BLEU.
Pages: 3980-3994
Page count: 15
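To make the abstract's central idea concrete, the following is a minimal PyTorch sketch of a Transformer encoder layer with a pluggable cross-attention sub-layer: when hidden states from a parallel sentence in another language are supplied, predictions can draw on the bilingual context; when they are omitted, the layer reduces to a plain encoder layer. All class and argument names (e.g. PluggableCrossLingualLayer, cross_states) are illustrative assumptions, not VECO's actual implementation.

```python
# Hypothetical sketch of the abstract's idea: a Transformer encoder layer
# whose cross-attention sub-layer can be "plugged in or out" on demand.
# Names and hyperparameters are assumptions, not VECO's released code.
import torch
import torch.nn as nn


class PluggableCrossLingualLayer(nn.Module):
    def __init__(self, d_model: int = 768, n_heads: int = 12, d_ff: int = 3072):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Cross-attention over the other language's contextual representations.
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, x, cross_states=None):
        # Standard self-attention over the current language.
        h, _ = self.self_attn(x, x, x)
        x = self.norm1(x + h)
        # Optional cross-attention: with the other language's hidden states,
        # masked words need not be predicted from their own language alone;
        # with cross_states=None, this is an ordinary encoder layer.
        if cross_states is not None:
            h, _ = self.cross_attn(x, cross_states, cross_states)
            x = self.norm2(x + h)
        return self.norm3(x + self.ffn(x))


# Usage: pretrain with the module plugged in; fine-tune with it in or out.
layer = PluggableCrossLingualLayer()
src = torch.randn(2, 16, 768)   # e.g. English token representations
tgt = torch.randn(2, 20, 768)   # e.g. German token representations
out_bilingual = layer(src, cross_states=tgt)  # cross-attention plugged in
out_mono = layer(src)                         # cross-attention plugged out
```

Because the cross-attention path is an additive residual branch, removing it leaves a standard encoder for understanding tasks, while keeping it yields an encoder-decoder-like structure suitable for generation, which is the flexibility the abstract claims.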
Related Papers
50 in total
  • [41] XLPT-AMR: Cross-Lingual Pre-Training via Multi-Task Learning for Zero-Shot AMR Parsing and Text Generation
    Xu, Dongqin
    Li, Junhui
    Zhu, Muhua
    Zhang, Min
    Zhou, Guodong
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021, : 896 - 907
  • [42] (Almost) Zero-Shot Cross-Lingual Spoken Language Understanding
    Upadhyay, Shyam
    Faruqui, Manaal
    Tur, Gokhan
    Hakkani-Tur, Dilek
    Heck, Larry
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 6034 - 6038
  • [43] Understanding Translationese in Cross-Lingual Summarization
    Wang, Jiaan
    Meng, Fandong
    Liang, Yunlong
    Zhang, Tingyi
    Xu, Jiarong
    Li, Zhixu
    Zhou, Jie
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 3837 - 3849
  • [44] Cross-lingual Language Model Pretraining
    Conneau, Alexis
    Lample, Guillaume
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [45] Language Agnostic Speaker Embedding for Cross-Lingual Personalized Speech Generation
    Zhou, Yi
    Tian, Xiaohai
    Li, Haizhou
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 3427 - 3439
  • [46] BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
    Li, Junnan
    Li, Dongxu
    Xiong, Caiming
    Hoi, Steven
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [47] MLUG: Bootstrapping Language-Motion Pre-Training for Unified Motion-Language Understanding and Generation
    Luo, Hongliang
    Xi, Wei
    Tang, Daniel
    SENSORS, 2024, 24 (22)
  • [48] Cross-Lingual Image Caption Generation
    Miyazaki, Takashi
    Shimizu, Nobuyuki
    PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 1780 - 1790
  • [49] Self-training Improves Pre-training for Natural Language Understanding
    Du, Jingfei
    Grave, Edouard
    Gunel, Beliz
    Chaudhary, Vishrav
    Celebi, Onur
    Auli, Michael
    Stoyanov, Veselin
    Conneau, Alexis
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 5408 - 5418
  • [50] Fusion or Defusion? Flexible Vision-and-Language Pre-Training
    Sun, Rongyi
    Li, Ziran
    Ding, Yifeng
    Wang, Qifan
    Wang, Jingang
    Zheng, Hai-Tao
    Wu, Wei
    Xian, Yunsen
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 5105 - 5119