Neural Machine Translation with Target-Attention Model

Cited by: 6
Authors
Yang, Mingming [1 ]
Zhang, Min [1 ,2 ]
Chen, Kehai [3 ]
Wang, Rui [3 ]
Zhao, Tiejun [1 ]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Peoples R China
[2] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Peoples R China
[3] Natl Inst Informat & Commun Technol, Kyoto 6190289, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
attention mechanism; neural machine translation; forward target-attention model; reverse target-attention model; bidirectional target-attention model;
DOI
10.1587/transinf.2019EDP7157
CLC Number
TP [Automation and Computer Technology];
Discipline Code
0812;
Abstract
The attention mechanism, which selectively focuses on source-side information to learn a context vector for generating target words, has proven to be an effective method for neural machine translation (NMT). However, generating target words depends not only on source-side information but also on target-side information. Although vanilla NMT acquires target-side information implicitly through recurrent neural networks (RNNs), RNNs cannot adequately capture the global relationships among target-side words. To solve this problem, this paper proposes a novel target-attention approach that captures this information directly, thus enhancing target word prediction in NMT. Specifically, we propose three variants of the target-attention model that directly model the global relationships among target words: 1) a forward target-attention model that uses a target-attention mechanism to incorporate previously generated target words into the prediction of the current target word; 2) a reverse target-attention model that adopts a reverse RNN to encode the entire target sequence in reverse order and then combines it with the source context information to generate the target sequence; 3) a bidirectional target-attention model that combines the forward and reverse target-attention models, making full use of target words to further improve NMT performance. Our methods can be integrated into both RNN-based and self-attention-based NMT, helping the model obtain global target-side information to improve translation quality. Experiments on the NIST Chinese-to-English and WMT English-to-German translation tasks show that the proposed models achieve significant improvements over state-of-the-art baselines.
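To make the forward variant concrete, here is a minimal sketch of one forward target-attention decoding step in Python. The abstract does not give the paper's exact scoring or fusion functions, so this sketch assumes dot-product scoring and concatenation-based fusion; all function names, shapes, and the fusion choice are illustrative, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - np.max(x))
    return e / np.sum(e)

def forward_target_attention(query, prev_target_states, source_context):
    """One decoding step of a forward target-attention sketch.

    query              : (d,)   current decoder state
    prev_target_states : (t, d) hidden states of the t previously
                                generated target words (assumes t >= 1)
    source_context     : (d,)   context vector from the usual
                                source-side attention
    Returns a fused context vector used to predict the next target word.
    """
    # Dot-product scores between the current decoder state and every
    # previous target-side state (the scoring function is an assumption).
    scores = prev_target_states @ query            # (t,)
    weights = softmax(scores)                      # attention over history
    target_context = weights @ prev_target_states  # (d,) weighted sum

    # Fuse source-side and target-side contexts; concatenation is one
    # simple choice, not necessarily the paper's.
    return np.concatenate([source_context, target_context])  # (2d,)

# Toy usage: d = 4, three previously generated target words.
rng = np.random.default_rng(0)
fused = forward_target_attention(
    query=rng.normal(size=4),
    prev_target_states=rng.normal(size=(3, 4)),
    source_context=rng.normal(size=4),
)
print(fused.shape)  # (8,)
```

The reverse variant would instead attend over the hidden states of an RNN run over the target sequence from right to left, and the bidirectional variant would combine the forward and reverse context vectors.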
Pages: 684-694
Number of pages: 11
Related Papers
(50 in total)
  • [41] On Using Very Large Target Vocabulary for Neural Machine Translation
    Jean, Sébastien
    Cho, Kyunghyun
    Memisevic, Roland
    Bengio, Yoshua
    PROCEEDINGS OF THE 53RD ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 7TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1, 2015, : 1 - 10
  • [42] Conversational Model Using Neural Machine Translation
    Ramu, Agusthiyar
    Gokul, R.
    Sumesh, Nihal
    BIOSCIENCE BIOTECHNOLOGY RESEARCH COMMUNICATIONS, 2020, 13 (04): : 159 - 162
  • [43] Analyzing the Source and Target Contributions to Predictions in Neural Machine Translation
    Voita, Elena
    Sennrich, Rico
    Titov, Ivan
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021, : 1126 - 1140
  • [44] On integrating a language model into neural machine translation
    Gulcehre, Caglar
    Firat, Orhan
    Xu, Kelvin
    Cho, Kyunghyun
    Bengio, Yoshua
    COMPUTER SPEECH AND LANGUAGE, 2017, 45 : 137 - 148
  • [45] Regularizing Neural Machine Translation by Target-Bidirectional Agreement
    Zhang, Zhirui
    Wu, Shuangzhi
    Liu, Shujie
    Li, Mu
    Zhou, Ming
    Xu, Tong
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 443 - 450
  • [46] A Convolutional Encoder Model for Neural Machine Translation
    Gehring, Jonas
    Auli, Michael
    Grangier, David
    Dauphin, Yann N.
    PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1, 2017, : 123 - 135
  • [47] Neural Hidden Markov Model for Machine Translation
    Wang, Weiyue
    Zhu, Derui
    Alkhouli, Tamer
    Gan, Zixuan
    Ney, Hermann
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2, 2018, : 377 - 382
  • [48] A Non-Autoregressive Neural Machine Translation Model With Iterative Length Update of Target Sentence
    Lim, Yeon-Soo
    Park, Eun-Ju
    Song, Hyun-Je
    Park, Seong-Bae
    IEEE ACCESS, 2022, 10 : 43341 - 43350
  • [49] Machine Translation for Indian Languages Utilizing Recurrent Neural Networks and Attention
    Sharma, Sonali
    Diwakar, Manoj
    DISTRIBUTED COMPUTING AND OPTIMIZATION TECHNIQUES, ICDCOT 2021, 2022, 903 : 593 - 602
  • [50] Hybrid Attention for Chinese Character-Level Neural Machine Translation
    Wang, Feng
    Chen, Wei
    Yang, Zhen
    Xu, Shuang
    Xu, Bo
    NEUROCOMPUTING, 2019, 358 : 44 - 52