Neural Machine Translation with Target-Attention Model

Times Cited: 6
Authors
Yang, Mingming [1 ]
Zhang, Min [1 ,2 ]
Chen, Kehai [3 ]
Wang, Rui [3 ]
Zhao, Tiejun [1 ]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Peoples R China
[2] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Peoples R China
[3] Natl Inst Informat & Commun Technol, Kyoto 6190289, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
attention mechanism; neural machine translation; forward target-attention model; reverse target-attention model; bidirectional target-attention model;
DOI
10.1587/transinf.2019EDP7157
Chinese Library Classification (CLC) Code
TP [Automation Technology; Computer Technology];
Discipline Classification Code
0812;
Abstract
The attention mechanism, which selectively focuses on source-side information to learn a context vector for generating target words, has been shown to be an effective method for neural machine translation (NMT). In fact, generating target words depends not only on the source-side information but also on the target-side information. Although vanilla NMT acquires target-side information implicitly through recurrent neural networks (RNNs), RNNs cannot adequately capture the global relationship among target-side words. To solve this problem, this paper proposes a novel target-attention approach to capture this information, thus enhancing target word prediction in NMT. Specifically, we propose three variants of the target-attention model to directly obtain the global relationship among target words: 1) a forward target-attention model that uses a target-attention mechanism to incorporate previously generated target words into the prediction of the current target word; 2) a reverse target-attention model that adopts a reverse RNN to obtain information about the entire reversed target sequence, which is then combined with the source context information to generate the target sequence; and 3) a bidirectional target-attention model that combines the forward and reverse target-attention models, making full use of the target words to further improve NMT performance. Our methods can be integrated into both RNN-based and self-attention-based NMT, helping NMT exploit global target-side information to improve translation quality. Experiments on the NIST Chinese-to-English and WMT English-to-German translation tasks show that the proposed models achieve significant improvements over state-of-the-art baselines.
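To make the forward target-attention idea in the abstract concrete, below is a minimal NumPy sketch (not the authors' code) of one decoding step in which the decoder attends over the hidden states of previously generated target words in addition to the usual source-side attention. The function names, scaled dot-product scoring, and the concatenation-based fusion of the two context vectors are illustrative assumptions; the paper's actual parameterization may differ.

```python
# Minimal sketch of a forward target-attention decoding step.
# Assumption: scaled dot-product attention and concatenation-based fusion
# stand in for the paper's actual attention and fusion functions.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention(query, keys, values):
    # Attend a single query vector (d,) over n key/value vectors (n, d).
    d = query.shape[-1]
    scores = keys @ query / np.sqrt(d)   # (n,)
    weights = softmax(scores)            # (n,)
    return weights @ values              # (d,)

def forward_target_attention_step(s_t, src_states, prev_tgt_states):
    """One decoding step with an extra target-side attention.

    s_t             : current decoder state, shape (d,)
    src_states      : encoder hidden states, shape (m, d)
    prev_tgt_states : decoder states of already generated words, shape (k, d)
    """
    # Standard source attention, as in vanilla attentional NMT.
    src_context = attention(s_t, src_states, src_states)
    # Target attention over the generation history (the "forward" variant).
    tgt_context = attention(s_t, prev_tgt_states, prev_tgt_states)
    # Both contexts are fused with the decoder state before the output
    # softmax; simple concatenation is used here purely for illustration.
    return np.concatenate([s_t, src_context, tgt_context])
```

The reverse variant would instead run an RNN over the target sequence in reverse and attend over those states, and the bidirectional variant would combine both context vectors.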
Pages: 684-694
Number of Pages: 11
Related Papers
50 records in total
  • [1] Attention Focusing for Neural Machine Translation by Bridging Source and Target Embeddings
    Kuang, Shaohui
    Li, Junhui
    Branco, Antonio
    Luo, Weihua
    Xiong, Deyi
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1, 2018, : 1767 - 1776
  • [2] History Attention for Source-Target Alignment in Neural Machine Translation
    Huang, Yan
    Chao, Wenhan
    Zhang, Peidong
    Yu, Yuanyuan
    PROCEEDINGS OF 2018 TENTH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTATIONAL INTELLIGENCE (ICACI), 2018, : 619 - 624
  • [3] Look Harder: A Neural Machine Translation Model with Hard Attention
    Indurthi, Sathish
    Chung, Insoo
    Kim, Sangha
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 3037 - 3043
  • [4] A Visual Attention Grounding Neural Model for Multimodal Machine Translation
    Zhou, Mingyang
    Cheng, Runxiang
    Lee, Yong Jae
    Yu, Zhou
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 3643 - 3653
  • [5] Neural Machine Translation With GRU-Gated Attention Model
    Zhang, Biao
    Xiong, Deyi
    Xie, Jun
    Su, Jinsong
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (11) : 4688 - 4698
  • [6] Enhancement of Encoder and Attention Using Target Monolingual Corpora in Neural Machine Translation
    Imamura, Kenji
    Fujita, Atsushi
    Sumita, Eiichiro
    NEURAL MACHINE TRANSLATION AND GENERATION, 2018, : 55 - 63
  • [7] Recurrent Attention for Neural Machine Translation
    Zeng, Jiali
    Wu, Shuangzhi
    Yin, Yongjing
    Jiang, Yufan
    Li, Mu
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3216 - 3225
  • [8] Neural Machine Translation with Deep Attention
    Zhang, Biao
    Xiong, Deyi
    Su, Jinsong
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (01) : 154 - 163
  • [9] Attention-via-Attention Neural Machine Translation
    Zhao, Shenjian
    Zhang, Zhihua
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 563 - 570
  • [10] Object tracking in infrared images using a deep learning model and a target-attention mechanism
    Parhizkar, Mahboub
    Karamali, Gholamreza
    Abedi Ravan, Bahram
    Complex & Intelligent Systems, 2023, 9 : 1495 - 1506