Progressive modality-complement aggregative multitransformer for domain multi-modal neural machine translation

Cited by: 2
Authors
Guo, Junjun [1 ,2 ]
Hou, Zhenyu [1 ,2 ]
Xian, Yantuan [1 ,2 ]
Yu, Zhengtao [1 ,2 ]
Affiliations
[1] Kunming Univ Sci & Technol, Fac Informat Engn & Automat, Kunming, Peoples R China
[2] Yunnan Key Lab Artificial Intelligence, Kunming 650504, Yunnan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Domain multi-modal neural machine translation; Multi-modal transformer; Progressive modality-complement; Modality-specific mask
DOI
10.1016/j.patcog.2024.110294
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Domain-specific Multi-modal Neural Machine Translation (DMNMT) aims to translate domain-specific sentences from a source language to a target language by incorporating text-related visual information. Generally, domain-specific text-image data complement each other and can collaboratively enhance the representation of domain-specific information. Unfortunately, there is a considerable modality gap between image and text in both data format and semantic expression, which poses distinctive challenges for domain-text translation. Narrowing the modality gap and improving domain-aware representation are therefore two critical challenges in DMNMT. To this end, this paper proposes a progressive modality-complement aggregative MultiTransformer that simultaneously narrows the modality gap and captures domain-specific multi-modal representation. We first adopt a bidirectional progressive cross-modal interactive strategy to effectively align text-to-text, text-to-visual, and visual-to-text semantics in the multi-modal representation space by integrating visual and text information layer by layer. Subsequently, we introduce a modality-complement MultiTransformer based on progressive cross-modal interaction to extract the domain-related multi-modal representation, thereby enhancing machine translation performance. Experiments are conducted on the Fashion-MMT and Multi30k datasets, and the results show that the proposed approach outperforms the compared state-of-the-art (SOTA) methods on the En-Zh task in the e-commerce domain and on the En-De, En-Fr, and En-Cs tasks of Multi30k in the general domain. In-depth analysis confirms the validity of the proposed modality-complement MultiTransformer and the bidirectional progressive cross-modal interactive strategy for DMNMT.
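The abstract's bidirectional, layer-by-layer cross-modal interaction can be illustrated with a minimal sketch. This is not the paper's actual architecture: the function names, the residual-style mixing rule, and the `alpha` weight are illustrative assumptions; it shows only the general idea of text attending to visual features and vice versa across successive layers.

```python
import numpy as np

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys/values."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)            # (Lq, Lk) similarity
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ values                           # (Lq, d) attended context

def progressive_bidirectional_fusion(text, image, num_layers=3, alpha=0.5):
    """Hypothetical layer-by-layer bidirectional interaction: in each layer,
    text attends to visual features (text-to-visual) and visual features
    attend to text (visual-to-text); the attended context is mixed back
    with a residual-style update weighted by `alpha` (an assumption here)."""
    for _ in range(num_layers):
        text_ctx = cross_attention(text, image, image)    # text-to-visual
        image_ctx = cross_attention(image, text, text)    # visual-to-text
        text = (1 - alpha) * text + alpha * text_ctx
        image = (1 - alpha) * image + alpha * image_ctx
    return text, image

rng = np.random.default_rng(0)
text = rng.normal(size=(5, 16))    # 5 text tokens, hidden dim 16
image = rng.normal(size=(7, 16))   # 7 visual regions, hidden dim 16
fused_text, fused_image = progressive_bidirectional_fusion(text, image)
print(fused_text.shape, fused_image.shape)  # (5, 16) (7, 16)
```

Repeating the interaction over several layers, rather than fusing once, is what makes the alignment "progressive": each pass lets one modality refine its representation using the other's already partially aligned features.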
Pages: 12
Related Papers
(50 records in total)
  • [31] Optimizing Machine Translation Algorithms through Empirical Study of Multi-modal Information Fusion
    Zhong Xuewen
    2ND INTERNATIONAL CONFERENCE ON SUSTAINABLE COMPUTING AND SMART SYSTEMS, ICSCSS 2024, 2024, : 1336 - 1341
  • [32] Factorized Transformer for Multi-Domain Neural Machine Translation
    Deng, Yongchao
    Yu, Hongfei
    Yu, Heng
    Duan, Xiangyu
    Luo, Weihua
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 4221 - 4230
  • [33] Adding visual attention into encoder-decoder model for multi-modal machine translation
    Xu, Chun
    Yu, Zhengqing
    Shi, Xiayang
    Chen, Fang
    JOURNAL OF ENGINEERING RESEARCH, 2023, 11 (02):
  • [34] Domain Adaptation and Multi-Domain Adaptation for Neural Machine Translation: A Survey
    Saunders, Danielle
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2022, 75 : 351 - 424
  • [36] Multi-modal rumour detection using bilinear pooling and domain adversarial neural networks
    Wang C.
    Zhang H.
    Zhang J.
    Gu L.
    International Journal of Security and Networks, 2023, 18 (03) : 175 - 188
  • [37] Multi-modal indicators for estimating perceived cognitive load in post-editing of machine translation
    Herbig, Nico
    Pal, Santanu
    Vela, Mihaela
    Krueger, Antonio
    van Genabith, Josef
    MACHINE TRANSLATION, 2019, 33 (1-2) : 91 - 115
  • [38] Domain-Aware Self-Attention for Multi-Domain Neural Machine Translation
    Zhang, Shiqi
    Liu, Yan
    Xiong, Deyi
    Zhang, Pei
    Chen, Boxing
    INTERSPEECH 2021, 2021, : 2047 - 2051
  • [39] A multi-domain adaptive neural machine translation method based on domain data balancer
    Xu, Jinlei
    Wen, Yonghua
    Huang, Shuanghong
    Yu, Zhengtao
    INTELLIGENT DATA ANALYSIS, 2024, 28 (03) : 685 - 698
  • [40] Multi-Domain Neural Machine Translation with Word-Level Domain Context Discrimination
    Zeng, Jiali
    Su, Jinsong
    Wen, Huating
    Liu, Yang
    Xie, Jun
    Yin, Yongjing
    Zhao, Jianqiang
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 447 - 457