VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation

Cited by: 0
Authors
Luo, Fuli [1 ]
Wang, Wei [1 ]
Liu, Jiahao [1 ]
Liu, Yijia [1 ]
Bi, Bin [1 ]
Huang, Songfang [1 ]
Huang, Fei [1 ]
Si, Luo [1 ]
Affiliations
[1] Alibaba Group, Hangzhou, People's Republic of China
Keywords
DOI
None available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Existing work in multilingual pretraining has demonstrated the potential of cross-lingual transferability by training a unified Transformer encoder for multiple languages. However, much of this work relies only on a shared vocabulary and bilingual contexts to encourage correlation across languages, which is a loose and implicit way to align contextual representations between languages. In this paper, we plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages. It effectively avoids the degeneration of predicting masked words conditioned only on the context of the word's own language. More importantly, when fine-tuning on downstream tasks, the cross-attention module can be plugged in or out on demand, thus naturally benefiting a wider range of cross-lingual tasks, from language understanding to generation. As a result, the proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark, covering text classification, sequence labeling, question answering, and sentence retrieval. For cross-lingual generation tasks, it also outperforms all existing cross-lingual models and state-of-the-art Transformer variants on the WMT14 English-to-German and English-to-French translation datasets, with gains of up to 1-2 BLEU.
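To make the plug-in/plug-out idea from the abstract concrete, here is a minimal PyTorch sketch of a Transformer encoder layer with an optional cross-attention block. The class name PluggableEncoderLayer, the default hyperparameters, and the cross_states argument are assumptions for illustration; this sketch does not reproduce VECO's actual implementation.

```python
# Minimal sketch: an encoder layer whose cross-attention block can be
# "plugged in" (pre-training on bilingual pairs) or "plugged out"
# (fine-tuning on monolingual understanding tasks). All names and
# hyperparameters here are illustrative assumptions, not VECO's code.
import torch
import torch.nn as nn

class PluggableEncoderLayer(nn.Module):
    def __init__(self, d_model=768, n_heads=12, d_ff=3072, dropout=0.1):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads,
                                               dropout=dropout, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads,
                                                dropout=dropout, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                                 nn.Linear(d_ff, d_model))
        self.norm1, self.norm2, self.norm3 = (nn.LayerNorm(d_model) for _ in range(3))
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, cross_states=None):
        # Standard self-attention over the input sentence.
        h, _ = self.self_attn(x, x, x)
        x = self.norm1(x + self.dropout(h))
        # Optional cross-attention over the paired sentence in the other
        # language: queries come from x, keys/values from cross_states,
        # so masked-word prediction is conditioned on both languages.
        if cross_states is not None:
            h, _ = self.cross_attn(x, cross_states, cross_states)
            x = self.norm2(x + self.dropout(h))
        h = self.ffn(x)
        return self.norm3(x + self.dropout(h))

layer = PluggableEncoderLayer()
src = torch.randn(2, 16, 768)                # source-language hidden states
tgt = torch.randn(2, 20, 768)                # paired-sentence hidden states
plugged_in = layer(src, cross_states=tgt)    # cross-attention module "in"
plugged_out = layer(src)                     # module "out": plain encoder layer
```

When cross_states is omitted, the layer reduces to a standard encoder layer, which is what lets the same pre-trained weights serve both understanding tasks (module out) and generation tasks (module in).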
Pages: 3980-3994
Number of pages: 15
Related Papers
50 records in total
  • [31] MULTI-STYLE ADAPTIVE TRAINING FOR ROBUST CROSS-LINGUAL SPOKEN LANGUAGE UNDERSTANDING
    He, Xiaodong
    Deng, Li
    Hakkani-Tur, Dilek
    Tur, Gokhan
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013: 8342-8346
  • [32] Multimodal Pre-training Method for Vision-language Understanding and Generation
    Liu T.-Y.
    Wu Z.-X.
    Chen J.-J.
    Jiang Y.-G.
Ruan Jian Xue Bao/Journal of Software, 2023, 34 (05): 2024-2034
  • [33] An analysis on language transfer of pre-trained language model with cross-lingual post-training
    Son, Suhyune
    Park, Chanjun
    Lee, Jungseob
    Shim, Midan
    Lee, Chanhee
    Jang, Yoonna
    Seo, Jaehyung
    Lim, Jungwoo
    Lim, Heuiseok
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 267
  • [34] Cross-lingual Spoken Language Understanding with Regularized Representation Alignment
    Liu, Zihan
    Winata, Genta Indra
    Xu, Peng
    Lin, Zhaojiang
    Fung, Pascale
PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020: 7241-7251
  • [35] FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding
    Fang, Yuwei
    Wang, Shuohang
    Gan, Zhe
    Sun, Siqi
    Liu, Jingjing
THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35: 12776-12784
  • [36] EMMA-X: An EM-like Multilingual Pre-training Algorithm for Cross-lingual Representation Learning
    Guo, Ping
    Wei, Xiangpeng
    Hu, Yue
    Yang, Baosong
    Liu, Dayiheng
    Huang, Fei
    Xie, Jun
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [37] Unified pre-training for program understanding and generation
    Ahmad, Wasi Uddin
    Chakraborty, Saikat
    Ray, Baishakhi
    Chang, Kai-Wei
arXiv, 2021
  • [38] Unified Pre-training for Program Understanding and Generation
    Ahmad, Wasi Uddin
    Chakraborty, Saikat
    Ray, Baishakhi
    Chang, Kai-Wei
2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021: 2655-2668
  • [39] ZmBART: An Unsupervised Cross-lingual Transfer Framework for Language Generation
    Maurya, Kaushal Kumar
    Desarkar, Maunendra Sankar
    Kano, Yoshinobu
    Deepshikha, Kumari
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021: 2804-2818
  • [40] MPNet: Masked and Permuted Pre-training for Language Understanding
    Song, Kaitao
    Tan, Xu
    Qin, Tao
    Lu, Jianfeng
    Liu, Tie-Yan
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33 (NEURIPS 2020), 2020, 33