Contrastive Adversarial Training for Multi-Modal Machine Translation

Cited by: 2
Authors
Huang, Xin [1]
Zhang, Jiajun [1]
Zong, Chengqing [1]
Affiliations
[1] Univ Chinese Acad Sci, Chinese Acad Sci, Sch Artificial Intelligence, Natl Lab Pattern Recognit, Inst Automat, Intelligence Bldg, 95 Zhongguancun East Rd, Beijing 100190, Peoples R China
Keywords
contrastive learning; adversarial training; multi-modal machine translation
DOI
10.1145/3587267
Chinese Library Classification (CLC) number
TP18 [Theory of artificial intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The multi-modal machine translation task aims to improve translation quality with the help of additional visual input. The visual input is expected to disambiguate or complement the semantics when a sentence contains ambiguous words or incomplete expressions. Existing methods have tried many ways to fuse visual information into text representations. However, only a minority of sentences need extra visual information as a complement. Without guidance, models tend to learn text-only translation from the majority of well-aligned translation pairs. In this article, we propose a contrastive adversarial training approach to enhance visual participation in semantic representation learning. By contrasting multi-modal input with adversarial samples, the model learns to identify the most informed sample, which is coupled with a congruent image and several visual objects extracted from it. This approach prevents the visual information from being ignored and further fuses cross-modal information. We evaluate our method on three multi-modal language pairs. Experimental results show that our model improves translation accuracy. Further analysis shows that our model is more sensitive to visual information.
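The abstract does not give the training objective, so the following is only a minimal sketch of a generic InfoNCE-style contrastive loss, not the authors' formulation. It assumes that the multi-modal input, the congruent sample, and the adversarial samples have each been encoded as a fixed-length vector; the function name `contrastive_loss`, the `temperature` parameter, and the toy vectors are hypothetical and chosen for illustration.

```python
import math


def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss (illustrative assumption, not the paper's objective).

    The loss is small when `anchor` is more similar to `positive`
    (e.g. the sample paired with the congruent image) than to any of
    the `negatives` (e.g. adversarial samples).
    """

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    # Index 0 holds the positive pair; the rest are negative pairs.
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]

    # Numerically stable log-sum-exp over all candidates.
    max_logit = max(logits)
    log_denominator = max_logit + math.log(
        sum(math.exp(x - max_logit) for x in logits)
    )

    # Negative log-probability of picking the positive candidate.
    return log_denominator - logits[0]


# Toy vectors standing in for learned representations (hypothetical values).
anchor = [1.0, 0.0]
congruent = [0.9, 0.1]
adversarial = [[0.0, 1.0], [-1.0, 0.0]]

loss_congruent = contrastive_loss(anchor, congruent, adversarial)
loss_swapped = contrastive_loss(anchor, adversarial[0], [congruent, adversarial[1]])
```

In this toy setup `loss_congruent` is lower than `loss_swapped`, which is the behaviour a contrastive objective rewards: the representation that agrees with the congruent sample is preferred over the adversarial ones.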
Pages: 18