Efficient Context-Aware Neural Machine Translation with Layer-Wise Weighting and Input-Aware Gating

Cited by: 0
Authors
Xu, Hongfei [1 ,2 ]
Xiong, Deyi [3 ]
van Genabith, Josef [1 ,2 ]
Liu, Qiuhui [4 ]
Affiliations
[1] Saarland Univ, Saarbrucken, Germany
[2] German Res Ctr Artificial Intelligence, Kaiserslautern, Germany
[3] Tianjin Univ, Tianjin, Peoples R China
[4] China Mobile Online Serv, Hong Kong, Peoples R China
Source
PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020
Funding
National Natural Science Foundation of China
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Existing Neural Machine Translation (NMT) systems are generally trained on a large amount of sentence-level parallel data, and during prediction sentences are independently translated, ignoring cross-sentence contextual information. This leads to inconsistency between translated sentences. In order to address this issue, context-aware models have been proposed. However, document-level parallel data constitutes only a small part of the parallel data available, and many approaches build context-aware models based on a pre-trained frozen sentence-level translation model in a two-step training manner. The computational cost of these approaches is usually high. In this paper, we propose to make the most of layers pre-trained on sentence-level data in contextual representation learning, reusing representations from the sentence-level Transformer and significantly reducing the cost of incorporating contexts in translation. We find that representations from shallow layers of a pre-trained sentence-level encoder play a vital role in source context encoding, and propose to perform source context encoding upon weighted combinations of pre-trained encoder layers' outputs. Instead of separately performing source context and input encoding, we propose to iteratively and jointly encode the source input and its contexts and to generate input-aware context representations with a cross-attention layer and a gating mechanism, which resets irrelevant information in context encoding. Our context-aware Transformer model outperforms the recent CADec [Voita et al., 2019c] on the English-Russian subtitle data and is about twice as fast in training and decoding.
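The two mechanisms summarized above lend themselves to a compact illustration. Below is a minimal, hypothetical PyTorch sketch of (1) a learned softmax weighting over the frozen sentence-level encoder's layer outputs, used as the basis for source context encoding, and (2) an input-aware gate that cross-attends the context to the current source sentence and can reset irrelevant context information. All class, parameter, and dimension names here are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch (not the paper's code): layer-wise weighting over a
# frozen sentence-level encoder's layer outputs, plus an input-aware gate
# that cross-attends context to the current source input and resets
# irrelevant context information.
from typing import List

import torch
import torch.nn as nn
import torch.nn.functional as F


class LayerWeighting(nn.Module):
    """Softmax-weighted combination of all pre-trained encoder layers' outputs."""

    def __init__(self, n_layers: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_layers))

    def forward(self, layer_outputs: List[torch.Tensor]) -> torch.Tensor:
        # layer_outputs: one (batch, seq_len, d_model) tensor per encoder layer
        w = F.softmax(self.logits, dim=0)                  # (n_layers,)
        stacked = torch.stack(layer_outputs, dim=0)        # (n_layers, B, T, D)
        return (w.view(-1, 1, 1, 1) * stacked).sum(dim=0)  # (B, T, D)


class InputAwareGate(nn.Module):
    """Cross-attend context states to the source input, then gate out irrelevance."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, context: torch.Tensor, source: torch.Tensor) -> torch.Tensor:
        # context: (B, T_ctx, D) context-sentence states; source: (B, T_src, D)
        attended, _ = self.cross_attn(query=context, key=source, value=source)
        g = torch.sigmoid(self.gate(torch.cat([context, attended], dim=-1)))
        # g near 0 resets context positions judged irrelevant to the current input
        return g * attended


if __name__ == "__main__":
    B, T_ctx, T_src, D, L = 2, 12, 10, 512, 6
    layer_outs = [torch.randn(B, T_ctx, D) for _ in range(L)]  # frozen encoder outputs
    ctx = LayerWeighting(L)(layer_outs)
    out = InputAwareGate(D, n_heads=8)(ctx, torch.randn(B, T_src, D))
    print(out.shape)  # torch.Size([2, 12, 512])
```

In this sketch the sigmoid gate, computed from the context states and their input-attended counterparts, scales the cross-attended context so that positions irrelevant to the current source sentence contribute little to the final context representation.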
Pages: 3933-3940
Page count: 8
Related Papers
45 in total
  • [1] Challenges in Context-Aware Neural Machine Translation
    Jin, Linghao
    He, Jacqueline
    May, Jonathan
    Ma, Xuezhe
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 15246 - 15263
  • [2] Context-Aware Monolingual Repair for Neural Machine Translation
    Voita, Elena
    Sennrich, Rico
    Titov, Ivan
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 877 - 886
  • [3] A Context-Aware Recurrent Encoder for Neural Machine Translation
    Zhang, Biao
    Xiong, Deyi
    Su, Jinsong
    Duan, Hong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (12) : 2424 - 2432
  • [4] A study of BERT for context-aware neural machine translation
    Wu, Xueqing
    Xia, Yingce
    Zhu, Jinhua
    Wu, Lijun
    Xie, Shufang
    Qin, Tao
    MACHINE LEARNING, 2022, 111 (03) : 917 - 935
  • [5] Selective Attention for Context-aware Neural Machine Translation
    Maruf, Sameen
    Martins, Andre F. T.
    Haffari, Gholamreza
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 3092 - 3102
  • [6] Context-Aware Neural Machine Translation Learns Anaphora Resolution
    Voita, Elena
    Serdyukov, Pavel
    Sennrich, Rico
    Titov, Ivan
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1, 2018, : 1264 - 1274
  • [7] Context-Aware Neural Machine Translation for Korean Honorific Expressions
    Hwang, Yongkeun
    Kim, Yanghoon
    Jung, Kyomin
    ELECTRONICS, 2021, 10 (13)
  • [8] One Type Context Is Not Enough: Global Context-aware Neural Machine Translation
    Chen, Linqing
    Li, Junhui
    Gong, Zhengxian
    Zhang, Min
    Zhou, Guodong
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (06)
  • [9] Context-Aware Linguistic Steganography Model Based on Neural Machine Translation
    Ding, Changhao
    Fu, Zhangjie
    Yang, Zhongliang
    Yu, Qi
    Li, Daqiu
    Huang, Yongfeng
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 868 - 878