Efficient Context-Aware Neural Machine Translation with Layer-Wise Weighting and Input-Aware Gating

Cited by: 0
Authors
Xu, Hongfei [1 ,2 ]
Xiong, Deyi [3 ]
van Genabith, Josef [1 ,2 ]
Liu, Qiuhui [4 ]
Affiliations
[1] Saarland Univ, Saarbrucken, Germany
[2] German Res Ctr Artificial Intelligence, Kaiserslautern, Germany
[3] Tianjin Univ, Tianjin, Peoples R China
[4] China Mobile Online Serv, Hong Kong, Peoples R China
Source
PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020
Funding
National Natural Science Foundation of China
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Existing Neural Machine Translation (NMT) systems are generally trained on a large amount of sentence-level parallel data, and during prediction sentences are independently translated, ignoring cross-sentence contextual information. This leads to inconsistency between translated sentences. In order to address this issue, context-aware models have been proposed. However, document-level parallel data constitutes only a small part of the parallel data available, and many approaches build context-aware models based on a pre-trained frozen sentence-level translation model in a two-step training manner. The computational cost of these approaches is usually high. In this paper, we propose to make the most of layers pre-trained on sentence-level data in contextual representation learning, reusing representations from the sentence-level Transformer and significantly reducing the cost of incorporating contexts in translation. We find that representations from shallow layers of a pre-trained sentence-level encoder play a vital role in source context encoding, and propose to perform source context encoding upon weighted combinations of pre-trained encoder layers' outputs. Instead of separately performing source context and input encoding, we propose to iteratively and jointly encode the source input and its contexts and to generate input-aware context representations with a cross-attention layer and a gating mechanism, which resets irrelevant information in context encoding. Our context-aware Transformer model outperforms the recent CADec [Voita et al., 2019c] on the English-Russian subtitle data and is about twice as fast in training and decoding.
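The two mechanisms summarized above lend themselves to a compact illustration. Below is a minimal, hypothetical PyTorch sketch of (1) a learned softmax weighting over the frozen sentence-level encoder's layer outputs, used as the basis for source context encoding, and (2) an input-aware gate that cross-attends the context to the current source sentence and can reset irrelevant context information. All class, parameter, and dimension names here are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch (not the paper's code): layer-wise weighting over a
# frozen sentence-level encoder's layer outputs, plus an input-aware gate
# that cross-attends context to the current source input and resets
# irrelevant context information.
from typing import List

import torch
import torch.nn as nn
import torch.nn.functional as F


class LayerWeighting(nn.Module):
    """Softmax-weighted combination of all pre-trained encoder layers' outputs."""

    def __init__(self, n_layers: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_layers))

    def forward(self, layer_outputs: List[torch.Tensor]) -> torch.Tensor:
        # layer_outputs: one (batch, seq_len, d_model) tensor per encoder layer
        w = F.softmax(self.logits, dim=0)                  # (n_layers,)
        stacked = torch.stack(layer_outputs, dim=0)        # (n_layers, B, T, D)
        return (w.view(-1, 1, 1, 1) * stacked).sum(dim=0)  # (B, T, D)


class InputAwareGate(nn.Module):
    """Cross-attend context states to the source input, then gate out irrelevance."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, context: torch.Tensor, source: torch.Tensor) -> torch.Tensor:
        # context: (B, T_ctx, D) context-sentence states; source: (B, T_src, D)
        attended, _ = self.cross_attn(query=context, key=source, value=source)
        g = torch.sigmoid(self.gate(torch.cat([context, attended], dim=-1)))
        # g near 0 resets context positions judged irrelevant to the current input
        return g * attended


if __name__ == "__main__":
    B, T_ctx, T_src, D, L = 2, 12, 10, 512, 6
    layer_outs = [torch.randn(B, T_ctx, D) for _ in range(L)]  # frozen encoder outputs
    ctx = LayerWeighting(L)(layer_outs)
    out = InputAwareGate(D, n_heads=8)(ctx, torch.randn(B, T_src, D))
    print(out.shape)  # torch.Size([2, 12, 512])
```

In this sketch the sigmoid gate, computed from the context states and their input-attended counterparts, scales the cross-attended context so that positions irrelevant to the current source sentence contribute little to the final context representation.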
Pages: 3933-3940
Page count: 8
Related Papers
45 in total
  • [1] Challenges in Context-Aware Neural Machine Translation
    Jin, Linghao
    He, Jacqueline
    May, Jonathan
    Ma, Xuezhe
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 15246 - 15263
  • [2] Context-Aware Monolingual Repair for Neural Machine Translation
    Voita, Elena
    Sennrich, Rico
    Titov, Ivan
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 877 - 886
  • [3] A Context-Aware Recurrent Encoder for Neural Machine Translation
    Zhang, Biao
    Xiong, Deyi
    Su, Jinsong
    Duan, Hong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (12) : 2424 - 2432
  • [4] A study of BERT for context-aware neural machine translation
    Wu, Xueqing
    Xia, Yingce
    Zhu, Jinhua
    Wu, Lijun
    Xie, Shufang
    Qin, Tao
    MACHINE LEARNING, 2022, 111 (03) : 917 - 935
  • [5] Selective Attention for Context-aware Neural Machine Translation
    Maruf, Sameen
    Martins, Andre F. T.
    Haffari, Gholamreza
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 3092 - 3102
  • [6] Context-Aware Neural Machine Translation Learns Anaphora Resolution
    Voita, Elena
    Serdyukov, Pavel
    Sennrich, Rico
    Titov, Ivan
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1, 2018, : 1264 - 1274
  • [7] Context-Aware Neural Machine Translation for Korean Honorific Expressions
    Hwang, Yongkeun
    Kim, Yanghoon
    Jung, Kyomin
    ELECTRONICS, 2021, 10 (13)
  • [8] One Type Context Is Not Enough: Global Context-aware Neural Machine Translation
    Chen, Linqing
    Li, Junhui
    Gong, Zhengxian
    Zhang, Min
    Zhou, Guodong
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (06)
  • [9] Context-Aware Linguistic Steganography Model Based on Neural Machine Translation
    Ding, Changhao
    Fu, Zhangjie
    Yang, Zhongliang
    Yu, Qi
    Li, Daqiu
    Huang, Yongfeng
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 868 - 878