G-Transformer for Document-level Machine Translation

Cited by: 0
Authors
Bao, Guangsheng [1 ,2 ]
Zhang, Yue [1 ,2 ]
Teng, Zhiyang [1 ,2 ]
Chen, Boxing [3 ]
Luo, Weihua [3 ]
Affiliations
[1] Westlake Univ, Sch Engn, Hangzhou, Peoples R China
[2] Westlake Inst Adv Study, Inst Adv Technol, Hangzhou, Peoples R China
[3] Alibaba Grp Inc, DAMO Acad, Hangzhou, Peoples R China
Source
59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1, 2021
Keywords
(none listed)
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Document-level MT models are still far from satisfactory. Existing work extends the translation unit from a single sentence to multiple sentences. However, studies show that when the translation unit is further enlarged to a whole document, supervised training of Transformer can fail. In this paper, we find that such failure is caused not by overfitting but by the training becoming stuck in local minima. Our analysis shows that the increased complexity of target-to-source attention is one reason for the failure. As a solution, we propose G-Transformer, which introduces a locality assumption as an inductive bias into Transformer, reducing the hypothesis space of the attention from target to source. Experiments show that G-Transformer converges faster and more stably than Transformer, achieving new state-of-the-art BLEU scores on three benchmark datasets in both non-pretraining and pretraining settings.
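The core idea described in the abstract, restricting target-to-source attention so that each target token only attends to its aligned source sentence, can be illustrated with a short sketch. The following is a minimal illustration, not the authors' implementation; the function name group_attention and all tensor names are hypothetical, and it assumes each token has been assigned a sentence-group id:

```python
# A minimal sketch (not the authors' code) of the locality idea behind
# G-Transformer: each target token may only attend to source tokens
# belonging to the same sentence group, restricting the hypothesis
# space of target-to-source attention.

import torch
import torch.nn.functional as F

def group_attention(query, key, value, tgt_groups, src_groups):
    """Cross-attention where target position i may only attend to
    source positions j carrying the same sentence-group id.

    query:      (tgt_len, d) target hidden states
    key, value: (src_len, d) source hidden states
    tgt_groups: (tgt_len,) sentence id of each target token
    src_groups: (src_len,) sentence id of each source token
    """
    d = query.size(-1)
    scores = query @ key.t() / d ** 0.5                    # (tgt_len, src_len)
    # Locality mask: True where target and source tokens share a sentence id.
    same_sentence = tgt_groups.unsqueeze(1) == src_groups.unsqueeze(0)
    scores = scores.masked_fill(~same_sentence, float("-inf"))
    return F.softmax(scores, dim=-1) @ value               # (tgt_len, d)

# Toy example: a 2-sentence document with 3 source and 4 target tokens.
src_h = torch.randn(3, 8)
tgt_h = torch.randn(4, 8)
src_ids = torch.tensor([0, 0, 1])     # source tokens -> sentence ids
tgt_ids = torch.tensor([0, 0, 1, 1])  # target tokens -> sentence ids
out = group_attention(tgt_h, src_h, src_h, tgt_ids, src_ids)
print(out.shape)  # torch.Size([4, 8])
```

Masking disallowed positions with -inf before the softmax zeroes their attention weights, which is the standard way to impose such an inductive bias without changing the attention formulation itself.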
Pages: 3442-3455
Page count: 14
Related Papers
50 records in total (first 10 shown)
  • [1] Multi-Hop Transformer for Document-Level Machine Translation
    Zhang, Long
    Zhang, Tong
    Zhang, Haibo
    Yang, Baosong
    Ye, Wei
    Zhang, Shikun
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 3953 - 3963
  • [2] TANDO: A Corpus for Document-level Machine Translation
    Gete, Harritxu
    Etchegoyhen, Thierry
    Ponce, David
    Labaka, Gorka
    Aranberri, Nora
    Corral, Ander
    Saralegi, Xabier
    Santos, Igor Ellakuria
    Martin, Maite
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 3026 - 3037
  • [3] Rethinking Document-level Neural Machine Translation
    Sun, Zewei
    Wang, Mingxuan
    Zhou, Hao
    Zhao, Chengqi
    Huang, Shujian
    Chen, Jiajun
    Li, Lei
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 3537 - 3548
  • [4] Document-Level Adaptation for Neural Machine Translation
    Kothur, Sachith Sri Ram
    Knowles, Rebecca
    Koehn, Philipp
    NEURAL MACHINE TRANSLATION AND GENERATION, 2018, : 64 - 73
  • [5] Improving the Transformer Translation Model with Document-Level Context
    Zhang, Jiacheng
    Luan, Huanbo
    Sun, Maosong
    Zhai, FeiFei
    Xu, Jingfang
    Zhang, Min
    Liu, Yang
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 533 - 542
  • [6] Corpora for Document-Level Neural Machine Translation
    Liu, Siyou
    Zhang, Xiaojun
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 3775 - 3781
  • [7] Document-Level Machine Translation as a Re-translation Process
    Martinez Garcia, Eva
    Espana-Bonet, Cristina
    Marquez, Lluis
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2014, (53): 103 - 110
  • [8] On Search Strategies for Document-Level Neural Machine Translation
    Herold, Christian
    Ney, Hermann
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 12827 - 12836
  • [9] Document-Level Machine Translation with Large Language Models
    Wang, Longyue
    Lyu, Chenyang
    Ji, Tianbo
    Zhang, Zhirui
    Yu, Dian
    Shi, Shuming
    Tu, Zhaopeng
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 16646 - 16661
  • [10] Exploring Discourse Structure in Document-level Machine Translation
    Hu, Xinyu
    Wan, Xiaojun
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 13889 - 13902