Multi-modal simultaneous machine translation fusion of image information

Cited by: 1
Authors
Huang, Yan [1]
Wang, Zhanyang [1]
Zhang, TianYuan [1]
Xu, Chun [2]
Liang, Hui [1]
Affiliations
[1] Zhengzhou Univ Light Ind, Coll Software Engn, Zhengzhou, Henan, Peoples R China
[2] Xinjiang Univ Finance & Econ, Coll Comp, Urumqi, Xinjiang, Peoples R China
Source
JOURNAL OF ENGINEERING RESEARCH | 2023 / Vol. 11 / Issue 02
Keywords
Simultaneous translation; Real-time; Surrounding scenes; Multi-modal; Image information;
DOI
10.1016/j.jer.2023.100085
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08 ;
Abstract
Simultaneous translation means translating a sentence before the speaker has finished it, so that the speaker's intention can be understood in real time. At present, simultaneous machine translation still relies on text-to-text data resources. In a pure text translation system, the decoder takes the encoder's output as its input data resource. This information is derived only from the text content, so the input comes from a single modality; the decoder is left short of decoding information, and words are omitted from the translation. A human translator also visually captures information from the surrounding scene to assist the translation work. Based on this observation, we propose a multi-modal simultaneous machine translation method that fuses image information. We extract information from the image and add it to the decoder side of the translation system, which increases the decoder's input data resources and helps the system improve translation quality. We verify the method experimentally on the Multi30K dataset. Compared with a plain-text translation system, the proposed method produces more complete sentences, richer content, and better translation results.
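The abstract does not say how the image information is combined with the text, nor which read/write policy the system follows. As a rough illustration only, the sketch below assumes a wait-k policy (the decoder emits target token t after reading t + k source tokens) and fuses by simple concatenation; mean pooling stands in for whatever attention the real model uses. All function names, the wait-k policy, and the fusion rule are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = List[float]


def visible_source_length(step: int, k: int, source_len: int) -> int:
    """Source tokens readable when emitting target token `step` (0-based).

    Assumes a wait-k policy with k >= 1; this policy is an assumption,
    not something stated in the abstract.
    """
    return min(step + k, source_len)


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average a non-empty sequence of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def fused_decoder_input(
    text_states: Sequence[Vector],
    image_feature: Vector,
    step: int,
    k: int,
) -> Vector:
    """Build a decoder input from the visible text prefix plus the image.

    The text side only sees the prefix allowed by wait-k, while the image
    feature is available in full from the first step. Concatenation is a
    hypothetical fusion rule chosen for simplicity.
    """
    n_visible = visible_source_length(step, k, len(text_states))
    pooled_text = mean_pool(text_states[:n_visible])
    return pooled_text + list(image_feature)


if __name__ == "__main__":
    # Toy encoder states for a 3-token source sentence and a 1-d image feature.
    states = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
    image = [0.5]
    for t in range(3):
        print(t, fused_decoder_input(states, image, step=t, k=1))
```

The point of the sketch is that at early steps the text prefix is short, so the image feature is the only complete source of context available to the decoder; this is the shortage of decoding information that the abstract says image fusion is meant to relieve.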
Pages: 7
Related papers
50 in total
  • [1] Optimizing Machine Translation Algorithms through Empirical Study of Multi-modal Information Fusion
    Zhong Xuewen
    2ND INTERNATIONAL CONFERENCE ON SUSTAINABLE COMPUTING AND SMART SYSTEMS, ICSCSS 2024, 2024, : 1336 - 1341
  • [2] Image Visual Attention Mechanism-based Global and Local Semantic Information Fusion for Multi-modal English Machine Translation
    Zhengzhou Railway Vocational and Technical College, Zhengzhou
    450000, China
    J. Comput., 2 (37-50): : 37 - 50
  • [3] Contextual Information Driven Multi-modal Medical Image Fusion
    Luo, Xiao-Qing
    Zhang, Zhan-Cheng
    Zhang, Bao-Cheng
    Wu, Xiao-Jun
    IETE TECHNICAL REVIEW, 2017, 34 (06) : 598 - 611
  • [4] Unsupervised Multi-modal Neural Machine Translation
    Su, Yuanhang
    Fan, Kai
    Nguyen Bach
    Kuo, C-C Jay
    Huang, Fei
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 10474 - 10483
  • [5] An error analysis for image-based multi-modal neural machine translation
    Calixto, Iacer
    Liu, Qun
    MACHINE TRANSLATION, 2019, 33 (1-2) : 155 - 177
  • [6] Multi-modal Image Fusion with KNN Matting
    Zhang, Xia
    Lin, Hui
    Kang, Xudong
    Li, Shutao
    PATTERN RECOGNITION (CCPR 2014), PT II, 2014, 484 : 89 - 96
  • [7] An overview of multi-modal medical image fusion
    Du, Jiao
    Li, Weisheng
    Lu, Ke
    Xiao, Bin
    NEUROCOMPUTING, 2016, 215 : 3 - 20
  • [8] A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation
    Yin, Yongjing
    Meng, Fandong
    Su, Jinsong
    Zhou, Chulun
    Yang, Zhengyuan
    Zhou, Jie
    Luo, Jiebo
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 3025 - 3035
  • [9] RetrievalMMT: Retrieval-Constrained Multi-Modal Prompt Learning for Multi-Modal Machine Translation
    Wang, Yan
    Zeng, Yawen
    Liang, Junjie
    Xing, Xiaofen
    Xu, Jin
    Xu, Xiangmin
    PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 860 - 868
  • [10] Video Pivoting Unsupervised Multi-Modal Machine Translation
    Li, Mingjie
    Huang, Po-Yao
    Chang, Xiaojun
    Hu, Junjie
    Yang, Yi
    Hauptmann, Alex
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (03) : 3918 - 3932