Progressive modality-complement aggregative multitransformer for domain multi-modal neural machine translation

Cited by: 2
Authors
Guo, Junjun [1 ,2 ]
Hou, Zhenyu [1 ,2 ]
Xian, Yantuan [1 ,2 ]
Yu, Zhengtao [1 ,2 ]
Affiliations
[1] Kunming Univ Sci & Technol, Fac Informat Engn & Automat, Kunming, Peoples R China
[2] Yunnan Key Lab Artificial Intelligence, Kunming 650504, Yunnan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Domain multi-modal neural machine translation; Multi-modal transformer; Progressive modality-complement; Modality-specific mask;
DOI
10.1016/j.patcog.2024.110294
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Domain-specific Multi-modal Neural Machine Translation (DMNMT) aims to translate domain-specific sentences from a source language to a target language by incorporating text-related visual information. Domain-specific text-image data often complement each other and have the potential to collaboratively enhance the representation of domain-specific information. Unfortunately, there is a considerable modality gap between image and text in data format and semantic expression, which leads to distinctive challenges in domain-text translation tasks. Narrowing the modality gap and improving domain-aware representation are two critical challenges in DMNMT. To this end, this paper proposes a progressive modality-complement aggregative MultiTransformer, which aims to simultaneously narrow the modality gap and capture domain-specific multi-modal representation. We first adopt a bidirectional progressive cross-modal interactive strategy to effectively narrow the text-to-text, text-to-visual, and visual-to-text semantic gaps in the multi-modal representation space by integrating visual and text information layer-by-layer. Subsequently, we introduce a modality-complement MultiTransformer based on progressive cross-modal interaction to extract the domain-related multi-modal representation, thereby enhancing machine translation performance. Experiments on the Fashion-MMT and Multi-30k datasets show that the proposed approach outperforms the compared state-of-the-art (SOTA) methods on the En-Zh task in the E-commerce domain and on the En-De, En-Fr, and En-Cs tasks of Multi-30k in the general domain. The in-depth analysis confirms the validity of the proposed modality-complement MultiTransformer and bidirectional progressive cross-modal interactive strategy for DMNMT.
Pages: 12