G-Transformer for Document-level Machine Translation

Cited by: 0
Authors
Bao, Guangsheng [1 ,2 ]
Zhang, Yue [1 ,2 ]
Teng, Zhiyang [1 ,2 ]
Chen, Boxing [3 ]
Luo, Weihua [3 ]
Affiliations
[1] Westlake Univ, Sch Engn, Hangzhou, Peoples R China
[2] Westlake Inst Adv Study, Inst Adv Technol, Hangzhou, Peoples R China
[3] Alibaba Grp Inc, DAMO Acad, Hangzhou, Peoples R China
Source
59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1, 2021
Keywords
(none listed)
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Document-level MT models are still far from satisfactory. Existing work extends the translation unit from a single sentence to multiple sentences. However, studies show that when the translation unit is further enlarged to a whole document, supervised training of Transformer can fail. In this paper, we find that such failure is caused not by overfitting but by the training becoming stuck in local minima. Our analysis shows that the increased complexity of target-to-source attention is one reason for the failure. As a solution, we propose G-Transformer, which introduces a locality assumption as an inductive bias into Transformer, reducing the hypothesis space of the attention from target to source. Experiments show that G-Transformer converges faster and more stably than Transformer, achieving new state-of-the-art BLEU scores on three benchmark datasets in both non-pretraining and pretraining settings.
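The core idea described in the abstract, restricting target-to-source attention so that each target token only attends to its aligned source sentence, can be illustrated with a short sketch. The following is a minimal illustration, not the authors' implementation; the function name group_attention and all tensor names are hypothetical, and it assumes each token has been assigned a sentence-group id:

```python
# A minimal sketch (not the authors' code) of the locality idea behind
# G-Transformer: each target token may only attend to source tokens
# belonging to the same sentence group, restricting the hypothesis
# space of target-to-source attention.

import torch
import torch.nn.functional as F

def group_attention(query, key, value, tgt_groups, src_groups):
    """Cross-attention where target position i may only attend to
    source positions j carrying the same sentence-group id.

    query:      (tgt_len, d) target hidden states
    key, value: (src_len, d) source hidden states
    tgt_groups: (tgt_len,) sentence id of each target token
    src_groups: (src_len,) sentence id of each source token
    """
    d = query.size(-1)
    scores = query @ key.t() / d ** 0.5                    # (tgt_len, src_len)
    # Locality mask: True where target and source tokens share a sentence id.
    same_sentence = tgt_groups.unsqueeze(1) == src_groups.unsqueeze(0)
    scores = scores.masked_fill(~same_sentence, float("-inf"))
    return F.softmax(scores, dim=-1) @ value               # (tgt_len, d)

# Toy example: a 2-sentence document with 3 source and 4 target tokens.
src_h = torch.randn(3, 8)
tgt_h = torch.randn(4, 8)
src_ids = torch.tensor([0, 0, 1])     # source tokens -> sentence ids
tgt_ids = torch.tensor([0, 0, 1, 1])  # target tokens -> sentence ids
out = group_attention(tgt_h, src_h, src_h, tgt_ids, src_ids)
print(out.shape)  # torch.Size([4, 8])
```

Masking disallowed positions with -inf before the softmax zeroes their attention weights, which is the standard way to impose such an inductive bias without changing the attention formulation itself.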
Pages: 3442-3455
Page count: 14
Related Papers
50 records in total (first 10 shown)
  • [1] Multi-Hop Transformer for Document-Level Machine Translation
    Zhang, Long
    Zhang, Tong
    Zhang, Haibo
    Yang, Baosong
    Ye, Wei
    Zhang, Shikun
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 3953 - 3963
  • [2] TANDO: A Corpus for Document-level Machine Translation
    Gete, Harritxu
    Etchegoyhen, Thierry
    Ponce, David
    Labaka, Gorka
    Aranberri, Nora
    Corral, Ander
    Saralegi, Xabier
    Santos, Igor Ellakuria
    Martin, Maite
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 3026 - 3037
  • [3] Rethinking Document-level Neural Machine Translation
    Sun, Zewei
    Wang, Mingxuan
    Zhou, Hao
    Zhao, Chengqi
    Huang, Shujian
    Chen, Jiajun
    Li, Lei
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 3537 - 3548
  • [4] Document-Level Adaptation for Neural Machine Translation
    Kothur, Sachith Sri Ram
    Knowles, Rebecca
    Koehn, Philipp
    NEURAL MACHINE TRANSLATION AND GENERATION, 2018, : 64 - 73
  • [5] Improving the Transformer Translation Model with Document-Level Context
    Zhang, Jiacheng
    Luan, Huanbo
    Sun, Maosong
    Zhai, FeiFei
    Xu, Jingfang
    Zhang, Min
    Liu, Yang
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 533 - 542
  • [6] Corpora for Document-Level Neural Machine Translation
    Liu, Siyou
    Zhang, Xiaojun
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 3775 - 3781
  • [7] Document-Level Machine Translation as a Re-translation Process
    Martinez Garcia, Eva
    Espana-Bonet, Cristina
    Marquez, Lluis
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2014, (53): 103 - 110
  • [8] On Search Strategies for Document-Level Neural Machine Translation
    Herold, Christian
    Ney, Hermann
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 12827 - 12836
  • [9] Document-Level Machine Translation with Large Language Models
    Wang, Longyue
    Lyu, Chenyang
    Ji, Tianbo
    Zhang, Zhirui
    Yu, Dian
    Shi, Shuming
    Tu, Zhaopeng
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 16646 - 16661
  • [10] Exploring Discourse Structure in Document-level Machine Translation
    Hu, Xinyu
    Wan, Xiaojun
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 13889 - 13902