Learning to decode to future success for multi-modal neural machine translation

Cited by: 2
Authors
Huang, Yan [1 ]
Zhang, TianYuan [1 ]
Xu, Chun [2 ]
Affiliations
[1] Zhengzhou Univ Light Ind, Coll Software Engn, Zhengzhou, Henan, Peoples R China
[2] Xinjiang Univ Finance & Econ, Coll Comp, Urumqi, Xinjiang, Peoples R China
Source
JOURNAL OF ENGINEERING RESEARCH | 2023 / Vol. 11 / Issue 02
Keywords
Neural machine translation; Multi-modal; Consistency; Visual annotation;
DOI
10.1016/j.jer.2023.100084
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08 ;
Abstract
Existing text-only neural machine translation (NMT) systems can benefit from explicitly modelling future target contexts as recurrent states. However, in conventional text-only NMT the modelled future target context remains implicit, because the target is invisible at inference time. In multi-modal neural machine translation (MNMT), the visual annotation depicts the content described by the bilingual parallel sentence pair, a property known as multi-modal consistency. This consistency offers an advantage: the future target context can be simulated from visual features. This paper proposes a novel translation model that allows MNMT to encode the future target context from the visual annotation during auto-regressive decoding. Our model uses visual-target consistency to enhance target generation. Moreover, we exploit multi-modal consistency, making full use of the visual annotation, to encourage semantic agreement between the bilingual parallel sentences and the pivoted visual annotation. Empirical results on several recent multi-modal translation datasets demonstrate that the proposed MNMT model significantly improves translation performance over a strong baseline, achieving new state-of-the-art results on all three language pairs with visual annotations. Our code will be available after acceptance.
Pages: 7
Related papers
50 in total
  • [11] Imaginations Generate Images for Multi-modal Machine Translation
    Yang, Xiaona
    Sun, Wenli
    Wei, Wei
    Li, Yinlin
    Shi, Xiayang
    PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND NETWORKS, VOL II, CENET 2023, 2024, 1126 : 120 - 128
  • [12] A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation
    Yin, Yongjing
    Meng, Fandong
    Su, Jinsong
    Zhou, Chulun
    Yang, Zhengyuan
    Zhou, Jie
    Luo, Jiebo
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 3025 - 3035
  • [13] Multi-modal simultaneous machine translation fusion of image information
    Huang, Yan
    Wang, Zhanyang
    Zhang, TianYuan
    Xu, Chun
    Liang, Hui
    JOURNAL OF ENGINEERING RESEARCH, 2023, 11 (02):
  • [14] Multi-Modal Approaches for Post-Editing Machine Translation
    Herbig, Nico
    Pal, Santanu
    van Genabith, Josef
    Krueger, Antonio
    CHI 2019: PROCEEDINGS OF THE 2019 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, 2019,
  • [15] Visual Agreement Regularized Training for Multi-Modal Machine Translation
    Yang, Pengcheng
    Chen, Boxing
    Zhang, Pei
    Sun, Xu
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 9418 - 9425
  • [16] Progressive modality-complement aggregative multitransformer for domain multi-modal neural machine translation
    Guo, Junjun
    Hou, Zhenyu
    Xian, Yantuan
    Yu, Zhengtao
    PATTERN RECOGNITION, 2024, 149
  • [17] A generic neural network for multi-modal sensorimotor learning
    Carenzi, F
    Bendahan, P
    Roschin, VY
    Frolov, AA
    Gorce, P
    Maier, MA
    COMPUTATIONAL NEUROSCIENCE: TRENDS IN RESEARCH 2004, 2004, : 525 - 533
  • [18] A generic neural network for multi-modal sensorimotor learning
    Carenzi, F
    Bendahan, P
    Roschin, VY
    Frolov, AA
    Gorce, P
    Maier, MA
    NEUROCOMPUTING, 2004, 58 : 525 - 533
  • [19] Layer-Level Progressive Transformer With Modality Difference Awareness for Multi-Modal Neural Machine Translation
    Guo, Junjun
    Ye, Junjie
    Xiang, Yan
    Yu, Zhengtao
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 3015 - 3026
  • [20] Dual-level interactive multimodal-mixup encoder for multi-modal neural machine translation
    Ye, Junjie
    Guo, Junjun
    Applied Intelligence, 2022, 52 : 14194 - 14203