Multi-modal simultaneous machine translation fusion of image information

Cited by: 1
Authors
Huang, Yan [1]
Wang, Zhanyang [1]
Zhang, TianYuan [1]
Xu, Chun [2]
Liang, Hui [1]
Affiliations
[1] Zhengzhou Univ Light Ind, Coll Software Engn, Zhengzhou, Henan, Peoples R China
[2] Xinjiang Univ Finance & Econ, Coll Comp, Urumqi, Xinjiang, Peoples R China
Source
JOURNAL OF ENGINEERING RESEARCH | 2023 / Vol. 11 / No. 02
Keywords
Simultaneous translation; Real-time; Surrounding scenes; Multi-modal; Image information;
DOI
10.1016/j.jer.2023.100085
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Simultaneous translation translates a sentence before the speaker has finished it, so that the speaker's intention can be understood in real time. At present, simultaneous machine translation still relies on text-to-text data resources. In a pure-text translation system, however, the decoder's only input data resource is the output of the encoder. This information is derived solely from the text content, so the input is single-modal, which leaves the decoder short of decoding information and causes words to be missed in translation. A human translator also visually captures information from the surrounding scene to assist the translation work. Based on this observation, we propose a multi-modal simultaneous machine translation method that fuses image information. We extract information from the image and add it to the decoder side of the translation system, which increases the decoder's input data resources and helps the system improve translation quality. We use the Multi30K dataset for experimental verification. Compared with a plain-text translation system, the proposed method translates more complete sentences with richer content and achieves better translation results.
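
The idea in the abstract, a decoder that sees only a prefix of the source text but has the whole image available from the first step, can be illustrated with a toy sketch. This is not the authors' model. The wait-k style read schedule, the gated-addition fusion, the mean pooling, and every function name below are assumptions made for illustration only; the abstract does not specify the read policy or the fusion operator.

```python
from typing import List, Sequence


def visible_prefix_lengths(src_len: int, tgt_len: int, k: int) -> List[int]:
    """Assumed wait-k style schedule: target step t may read
    min(k + t, src_len) source tokens (t starts at 0)."""
    return [min(k + t, src_len) for t in range(tgt_len)]


def mean_pool(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average a non-empty list of equal-length vectors into one vector."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def fuse(text_state: Sequence[float],
         image_feat: Sequence[float],
         gate: float = 0.5) -> List[float]:
    """Hypothetical fusion: add a gated image feature to the
    decoder-side text state. Both vectors must share a dimension."""
    if len(text_state) != len(image_feat):
        raise ValueError("text state and image feature must have the same dimension")
    return [t + gate * v for t, v in zip(text_state, image_feat)]


def decoder_inputs(src_embeddings: Sequence[Sequence[float]],
                   image_feat: Sequence[float],
                   tgt_len: int,
                   k: int) -> List[List[float]]:
    """For each target step, pool the source prefix read so far and
    fuse it with the image feature, which is fully available at step 0."""
    inputs = []
    for n in visible_prefix_lengths(len(src_embeddings), tgt_len, k):
        text_state = mean_pool(src_embeddings[:n])
        inputs.append(fuse(text_state, image_feat))
    return inputs
```

In this sketch the text context grows step by step while the image feature is constant, which is one way to picture how visual information could compensate for source words the decoder has not yet read.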
Pages: 7
Related papers
50 in total
  • [11] Contrastive Adversarial Training for Multi-Modal Machine Translation
    Huang, Xin
    Zhang, Jiajun
    Zong, Chengqing
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (06)
  • [12] Imaginations Generate Images for Multi-modal Machine Translation
    Yang, Xiaona
    Sun, Wenli
    Wei, Wei
    Li, Yinlin
    Shi, Xiayang
    PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND NETWORKS, VOL II, CENET 2023, 2024, 1126 : 120 - 128
  • [13] Toward Multi-Modal Conditioned Fashion Image Translation
    Gu, Xiaoling
    Yu, Jun
    Wong, Yongkang
    Kankanhalli, Mohan S.
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 2361 - 2371
  • [14] Guided Image Deblurring by Deep Multi-Modal Image Fusion
    Liu, Yuqi
    Sheng, Zehua
    Shen, Hui-Liang
    IEEE ACCESS, 2022, 10 : 130708 - 130718
  • [15] Multi-modal Fusion
    Liu, Huaping
    Hussain, Amir
    Wang, Shuliang
    INFORMATION SCIENCES, 2018, 432 : 462 - 462
  • [16] Multi-modal feature fusion for geographic image annotation
    Li, Ke
    Zou, Changqing
    Bu, Shuhui
    Liang, Yun
    Zhang, Jian
    Gong, Minglun
    PATTERN RECOGNITION, 2018, 73 : 1 - 14
  • [17] A novel multi-modal medical image fusion algorithm
    Xinhua Li
    Jing Zhao
    Journal of Ambient Intelligence and Humanized Computing, 2021, 12 : 1995 - 2002
  • [18] Image and Encoded Text Fusion for Multi-Modal Classification
    Gallo, I.
    Calefati, A.
    Nawaz, S.
    Janjua, M. K.
    2018 INTERNATIONAL CONFERENCE ON DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA), 2018, : 203 - 209
  • [19] A novel multi-modal medical image fusion algorithm
    Li, Xinhua
    Zhao, Jing
    JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2021, 12 (02) : 1995 - 2002
  • [20] Multi-modal cardiac image fusion and visualization on the GPU
    Kiss, Gabriel
    Asen, Jon Petter
    Bogaert, Jan
    Amundsen, Brage
    Claus, Piet
    D'hooge, Jan
    Torp, Hans G.
    2011 IEEE INTERNATIONAL ULTRASONICS SYMPOSIUM (IUS), 2011, : 254 - 257