Learning to decode to future success for multi-modal neural machine translation

Cited by: 2
Authors
Huang, Yan [1 ]
Zhang, TianYuan [1 ]
Xu, Chun [2 ]
Affiliations
[1] Zhengzhou Univ Light Ind, Coll Software Engn, Zhengzhou, Henan, Peoples R China
[2] Xinjiang Univ Finance & Econ, Coll Comp, Urumqi, Xinjiang, Peoples R China
Source
JOURNAL OF ENGINEERING RESEARCH | 2023, Vol. 11, Issue 02
Keywords
Neural machine translation; Multi-modal; Consistency; Visual annotation
DOI
10.1016/j.jer.2023.100084
Chinese Library Classification
T [Industrial Technology]
Subject Classification Code
08
Abstract
Existing text-only NMT (neural machine translation) systems can benefit from explicitly modelling future target contexts as recurrent states. In conventional text-only NMT, however, the modelled future context remains implicit, because the target sentence is invisible during inference. In Multi-modal Neural Machine Translation (MNMT), the visual annotation depicts the content described by the bilingual parallel sentence pair, a property known as multi-modal consistency. This consistency offers an advantage: the future target context can be simulated from visual features. This paper proposes a novel translation model that allows MNMT to encode the future target context from the visual annotation during auto-regressive decoding. Our model uses visual-target consistency to enhance target generation. Moreover, we exploit multi-modal consistency, making full use of the visual annotation to encourage semantic agreement between the bilingual parallel sentences and the pivoted visual annotation. Empirical results on several recent multi-modal translation datasets demonstrate that the proposed MNMT model significantly improves translation performance over a strong baseline, achieving new state-of-the-art results on all three language pairs with visual annotations. Our code will be available after acceptance.
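The abstract gives no implementation details, so the following is only a minimal sketch of the two ideas it describes, assuming PyTorch; the module name VisualFutureDecoderStep, the consistency_loss helper, and all dimensions are hypothetical illustrations, not the authors' code: (1) at each auto-regressive step, attend over visual region features as a stand-in for the unseen future target context; (2) a multi-modal consistency loss pulling the source, target, and visual representations into semantic agreement.

```python
# A minimal sketch (not the authors' code), assuming PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VisualFutureDecoderStep(nn.Module):
    """One decoding step that fuses visual context into the hidden state."""
    def __init__(self, d_model: int, d_vis: int):
        super().__init__()
        self.vis_proj = nn.Linear(d_vis, d_model)    # map region features to model dim
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.fuse = nn.Linear(2 * d_model, d_model)  # gate between state and visual context

    def forward(self, hidden, vis_feats):
        # hidden: (B, 1, d_model) current decoder state
        # vis_feats: (B, R, d_vis) R visual region features (the "visual annotation")
        v = self.vis_proj(vis_feats)
        # Visual proxy for the future target context, which is invisible at inference.
        future_ctx, _ = self.attn(hidden, v, v)
        gate = torch.sigmoid(self.fuse(torch.cat([hidden, future_ctx], dim=-1)))
        return gate * hidden + (1.0 - gate) * future_ctx

def consistency_loss(src_repr, tgt_repr, vis_repr):
    """Pull both sentence representations toward the pivoted visual representation."""
    src, tgt, vis = (F.normalize(x, dim=-1) for x in (src_repr, tgt_repr, vis_repr))
    # 2 - cos(src, vis) - cos(tgt, vis); zero when both sentences align with the image.
    return (2.0 - (src * vis).sum(-1) - (tgt * vis).sum(-1)).mean()

# Toy usage with random tensors, just to show the shapes.
step = VisualFutureDecoderStep(d_model=512, d_vis=2048)
h = torch.randn(8, 1, 512)           # batch of 8 decoder states
regions = torch.randn(8, 36, 2048)   # e.g. 36 detector-style region features
fused = step(h, regions)             # (8, 1, 512)
loss = consistency_loss(torch.randn(8, 512), torch.randn(8, 512), torch.randn(8, 512))
```

The gating design here is one plausible choice for blending the visual "future" context with the ordinary decoder state; the paper may fuse the modalities differently.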
Pages: 7
Related Papers
50 records in total
  • [21] Dual-level interactive multimodal-mixup encoder for multi-modal neural machine translation
    Ye, Junjie
    Guo, Junjun
    APPLIED INTELLIGENCE, 2022, 52 (12) : 14194 - 14203
  • [22] MMPE: A Multi-Modal Interface for Post-Editing Machine Translation
    Herbig, Nico
    Duewel, Tim
    Pal, Santanu
    Meladaki, Kalliopi
    Monshizadeh, Mahsa
    Krueger, Antonio
    van Genabith, Josef
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 1691 - 1702
  • [23] Multi-modal anchor adaptation learning for multi-modal summarization
    Chen, Zhongfeng
    Lu, Zhenyu
    Rong, Huan
    Zhao, Chuanjun
    Xu, Fan
    NEUROCOMPUTING, 2024, 570
  • [24] HybridVocab: Towards Multi-Modal Machine Translation via Multi-Aspect Alignment
    Peng, Ru
    Zeng, Yawen
    Zhao, Junbo
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2022, 2022, : 380 - 388
  • [25] Multi-modal Machine Learning Model for Interpretable Malware Classification
    Lisa, Fahmida Tasnim
    Islam, Sheikh Rabiul
    Kumar, Neha Mohan
    EXPLAINABLE ARTIFICIAL INTELLIGENCE, PT III, XAI 2024, 2024, 2155 : 334 - 349
  • [26] Multi-modal Machine Learning Investigation of Telework and Transit Connections
    Edward, Deirdre
    Soria, Jason
    Stathopoulos, Amanda
    DATA SCIENCE FOR TRANSPORTATION, 2024, 6 (2):
  • [27] Multi-Modal Hate Speech Recognition Through Machine Learning
    Institute of Electrical and Electronics Engineers Inc.
  • [28] Machine Learning of Multi-Modal Influences on Airport Pushback Delays
    Kicinger, Rafal
    Krozel, Jimmy
    Chen, Jit-Tat
    Schelling, Steven
    AIAA AVIATION FORUM AND ASCEND 2024, 2024,
  • [29] Machine Learning Based Multi-Modal Transportation Network Planner
    Manghat, Neeraj Menon
    Gopalakrishna, Vaishak
    Bonthu, Sai
    Hunt, Victor
    Helmicki, Arthur
    McClintock, Doug
    INTERNATIONAL CONFERENCE ON TRANSPORTATION AND DEVELOPMENT 2024: TRANSPORTATION SAFETY AND EMERGING TECHNOLOGIES, ICTD 2024, 2024, : 380 - 389
  • [30] Multi-modal Hate Speech Detection using Machine Learning
    Boishakhi, Fariha Tahosin
    Shill, Ponkoj Chandra
    Alam, Md Golam Rabiul
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 4496 - 4499