Learning to decode to future success for multi-modal neural machine translation

Cited by: 2
Authors
Huang, Yan [1]
Zhang, TianYuan [1]
Xu, Chun [2]
Affiliations
[1] Zhengzhou Univ Light Ind, Coll Software Engn, Zhengzhou, Henan, Peoples R China
[2] Xinjiang Univ Finance & Econ, Coll Comp, Urumqi, Xinjiang, Peoples R China
Source
JOURNAL OF ENGINEERING RESEARCH | 2023 / Vol. 11 / Issue 02
Keywords
Neural machine translation; Multi-modal; Consistency; Visual annotation;
DOI
10.1016/j.jer.2023.100084
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Existing text-only neural machine translation (NMT) systems can benefit from explicitly modelling future target contexts as recurrent states. In conventional text-only NMT, however, the modelled future target context remains implicit, because the target is not visible at inference time. In multi-modal neural machine translation (MNMT), the visual annotation presents the content described by the bilingual parallel sentence pair, a property known as multi-modal consistency. This consistency offers an advantage: the future target context can be simulated from visual features. This paper proposes a novel translation model that allows MNMT to encode the future target context from the visual annotation during auto-regressive decoding. The model uses visual-target consistency to enhance target generation. Moreover, it uses multi-modal consistency, which makes full use of the visual annotation, to encourage semantic agreement between the bilingual parallel sentences and the pivoted visual annotation. Empirical results on several recent multi-modal translation datasets demonstrate that the proposed MNMT model significantly improves translation performance over a strong baseline, achieving new state-of-the-art results on all three language pairs with visual annotations. Our code will be available after acceptance.
Pages: 7