Neural Machine Translation with Target-Attention Model

Cited by: 6
|
Authors
Yang, Mingming [1 ]
Zhang, Min [1 ,2 ]
Chen, Kehai [3 ]
Wang, Rui [3 ]
Zhao, Tiejun [1 ]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Peoples R China
[2] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Peoples R China
[3] Natl Inst Informat & Commun Technol, Kyoto 6190289, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
attention mechanism; neural machine translation; forward target-attention model; reverse target-attention model; bidirectional target-attention model;
DOI
10.1587/transinf.2019EDP7157
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
The attention mechanism, which selectively focuses on source-side information to learn a context vector for generating target words, has been shown to be an effective method for neural machine translation (NMT). In fact, generating target words depends not only on source-side information but also on target-side information. Although vanilla NMT can acquire target-side information implicitly through recurrent neural networks (RNNs), RNNs cannot adequately capture the global relationships among target-side words. To solve this problem, this paper proposes a novel target-attention approach that captures this information and thus enhances target word prediction in NMT. Specifically, we propose three variants of the target-attention model that directly model the global relationships among target words: 1) a forward target-attention model that uses a target attention mechanism to incorporate previously generated target words into the prediction of the current target word; 2) a reverse target-attention model that adopts a reverse RNN to obtain information about the entire target sequence in reverse order, which is then combined with the source context information to generate the target sequence; 3) a bidirectional target-attention model that combines the forward and reverse target-attention models, making full use of the target words to further improve NMT performance. Our methods can be integrated into both RNN-based NMT and self-attention-based NMT, helping NMT obtain global target-side information to improve translation quality. Experiments on the NIST Chinese-to-English and the WMT English-to-German translation tasks show that the proposed models achieve significant improvements over state-of-the-art baselines.
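To make the forward target-attention idea from the abstract concrete, the following is a minimal, illustrative PyTorch sketch, not the authors' implementation: the class name, additive scoring function, and all dimensions are assumptions. It attends over the hidden states of previously generated target words and returns a target-side context vector that can be fused with the usual source-side context before predicting the next word.

```python
# Minimal sketch (assumed names and dimensions, not the paper's released code):
# attend over previously generated target hidden states to build a target-side
# context vector, then fuse it with the standard source-side context.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ForwardTargetAttention(nn.Module):
    """Additive attention over the hidden states of previous target words."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.query_proj = nn.Linear(hidden_size, hidden_size)
        self.key_proj = nn.Linear(hidden_size, hidden_size)
        self.score = nn.Linear(hidden_size, 1)

    def forward(self, decoder_state, prev_target_states):
        # decoder_state:      (batch, hidden)        current decoder state s_t
        # prev_target_states: (batch, t-1, hidden)   states for words y_1 .. y_{t-1}
        query = self.query_proj(decoder_state).unsqueeze(1)         # (batch, 1, hidden)
        keys = self.key_proj(prev_target_states)                    # (batch, t-1, hidden)
        scores = self.score(torch.tanh(query + keys)).squeeze(-1)   # (batch, t-1)
        weights = F.softmax(scores, dim=-1)                         # attention over history
        # Target-side context: weighted sum of previous target hidden states.
        target_context = torch.bmm(weights.unsqueeze(1), prev_target_states).squeeze(1)
        return target_context                                       # (batch, hidden)


# Usage sketch: concatenate source and target contexts before the output layer.
batch, hidden, history_len = 2, 8, 5
attn = ForwardTargetAttention(hidden)
s_t = torch.randn(batch, hidden)                  # current decoder state
history = torch.randn(batch, history_len, hidden) # previous target hidden states
source_context = torch.randn(batch, hidden)       # output of standard source attention
target_context = attn(s_t, history)
fused = torch.cat([source_context, target_context], dim=-1)
```

A reverse target-attention variant would, by the same pattern, attend over states produced by a reverse RNN over the target sequence, and the bidirectional variant would combine both context vectors.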
Pages: 684-694
Number of pages: 11
Related Papers
50 records in total
  • [31] Training Deeper Neural Machine Translation Models with Transparent Attention
    Bapna, Ankur
    Chen, Mia Xu
    Firat, Orhan
    Cao, Yuan
    Wu, Yonghui
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 3028 - 3033
  • [32] Recursive Annotations for Attention-Based Neural Machine Translation
    Ye, Shaolin
    Guo, Wu
    2017 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2017, : 164 - 167
  • [33] Training with Adversaries to Improve Faithfulness of Attention in Neural Machine Translation
    Moradi, Pooya
    Kambhatla, Nishant
    Sarkar, Anoop
    AACL-IJCNLP 2020: THE 1ST CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 10TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING: PROCEEDINGS OF THE STUDENT RESEARCH WORKSHOP, 2020, : 86 - 93
  • [34] Towards Understanding Neural Machine Translation with Attention Heads' Importance
    Zhou, Zijie
    Zhu, Junguo
    Li, Weijiang
    APPLIED SCIENCES-BASEL, 2024, 14 (07):
  • [35] Fine-grained attention mechanism for neural machine translation
    Choi, Heeyoul
    Cho, Kyunghyun
    Bengio, Yoshua
    NEUROCOMPUTING, 2018, 284 : 171 - 176
  • [36] Syntax-Based Attention Masking for Neural Machine Translation
    McDonald, Colin
    Chiang, David
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 47 - 52
  • [37] Selective Attention for Context-aware Neural Machine Translation
    Maruf, Sameen
    Martins, Andre F. T.
    Haffari, Gholamreza
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 3092 - 3102
  • [38] Look-Ahead Attention for Generation in Neural Machine Translation
    Zhou, Long
    Zhang, Jiajun
    Zong, Chengqing
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2017, 2018, 10619 : 211 - 223
  • [39] Exploiting Target Language Data for Neural Machine Translation Beyond Back Translation
    Reheman, Abudurexiti
    Lu, Yingfeng
    Ruan, Junhao
    Ma, Anxiang
    Zhang, Chunliang
    Xiao, Tong
    Zhu, Jingbo
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 12216 - 12228
  • [40] On Target Representation in Continuous-output Neural Machine Translation
    Tokarchuk, Evgeniia
    Niculae, Vlad
    PROCEEDINGS OF THE 7TH WORKSHOP ON REPRESENTATION LEARNING FOR NLP, 2022, : 227 - 235