Learning to decode to future success for multi-modal neural machine translation

Cited: 2
Authors
Huang, Yan [1 ]
Zhang, TianYuan [1 ]
Xu, Chun [2 ]
Affiliations
[1] Zhengzhou Univ Light Ind, Coll Software Engn, Zhengzhou, Henan, Peoples R China
[2] Xinjiang Univ Finance & Econ, Coll Comp, Urumqi, Xinjiang, Peoples R China
Source
JOURNAL OF ENGINEERING RESEARCH | 2023, Vol. 11, Issue 2
Keywords
Neural machine translation; Multi-modal; Consistency; Visual annotation;
DOI
10.1016/j.jer.2023.100084
Chinese Library Classification
T [Industrial Technology];
Discipline code
08;
Abstract
Existing text-only NMT (neural machine translation) systems can benefit from explicitly modelling the future target context as recurrent states. However, in conventional text-only NMT the modelled future target context remains implicit, because the target is not visible at inference time. In Multi-modal Neural Machine Translation (MNMT), the visual annotation depicts the content described by the bilingual parallel sentence pair, a property referred to as multi-modal consistency. This consistency offers an advantage: the future target context can be simulated from visual features. This paper proposes a novel translation model that allows MNMT to encode the future target context from the visual annotation during auto-regressive decoding. Our model exploits visual-target consistency to enhance target generation. Moreover, we exploit multi-modal consistency, making full use of the visual annotation, to encourage semantic agreement between the bilingual parallel sentences and the pivoted visual annotation. Empirical results on several recent multi-modal translation datasets demonstrate that the proposed MNMT model significantly improves translation performance over a strong baseline, achieving new state-of-the-art results on all three language pairs with visual annotations. Our code will be available after acceptance.
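The abstract gives no implementation details, so the following is only a minimal sketch of the two ideas it describes, under assumed design choices: (1) letting the auto-regressive decoder cross-attend to visual region features as a stand-in for the unseen future target context, and (2) a multi-modal consistency term that pulls the pooled source, target, and visual representations toward agreement. All names here (FutureAwareDecoderStep, consistency_loss, tensor shapes) are hypothetical and are not taken from the paper or its released code.

```python
# Hypothetical sketch (not the authors' implementation): a decoder step that
# attends to visual features as a proxy for the future target context, plus a
# simple multi-modal consistency loss over pooled sentence/image embeddings.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FutureAwareDecoderStep(nn.Module):
    """One auto-regressive decoding step: causal self-attention over the
    generated prefix, cross-attention to the encoded source sentence, and
    cross-attention to visual features standing in for future target words."""

    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.src_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.vis_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.ReLU(), nn.Linear(4 * d_model, d_model)
        )
        self.norms = nn.ModuleList([nn.LayerNorm(d_model) for _ in range(4)])

    def forward(self, tgt_prefix, src_enc, vis_feats, causal_mask):
        # Causal self-attention over the already generated target prefix.
        h = self.norms[0](tgt_prefix + self.self_attn(
            tgt_prefix, tgt_prefix, tgt_prefix, attn_mask=causal_mask)[0])
        # Cross-attention to the encoded source sentence.
        h = self.norms[1](h + self.src_attn(h, src_enc, src_enc)[0])
        # Cross-attention to visual region features: the image describes the
        # whole sentence, so it can simulate target content not yet generated.
        h = self.norms[2](h + self.vis_attn(h, vis_feats, vis_feats)[0])
        return self.norms[3](h + self.ffn(h))


def consistency_loss(src_enc, tgt_enc, vis_feats):
    """One possible realisation of 'multi-modal consistency': pull the pooled
    source and target representations toward the pivoted visual one."""
    s = F.normalize(src_enc.mean(dim=1), dim=-1)
    t = F.normalize(tgt_enc.mean(dim=1), dim=-1)
    v = F.normalize(vis_feats.mean(dim=1), dim=-1)
    return (1 - F.cosine_similarity(s, v)).mean() + (1 - F.cosine_similarity(t, v)).mean()


if __name__ == "__main__":
    B, S, T, R, D = 2, 7, 5, 36, 512   # batch, src len, tgt prefix len, regions, dim
    step = FutureAwareDecoderStep(D)
    src, tgt, vis = torch.randn(B, S, D), torch.randn(B, T, D), torch.randn(B, R, D)
    mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
    out = step(tgt, src, vis, mask)
    print(out.shape, consistency_loss(src, out, vis).item())
```

In this sketch the consistency term would be added to the usual cross-entropy objective with a weighting coefficient; how the paper actually combines the two signals is not specified in the abstract.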
Pages: 7