Multi-modal simultaneous machine translation fusion of image information

Cited by: 1
Authors
Huang, Yan [1]
Wang, Zhanyang [1]
Zhang, TianYuan [1]
Xu, Chun [2]
Liang, Hui [1]
Affiliations
[1] Zhengzhou Univ Light Ind, Coll Software Engn, Zhengzhou, Henan, Peoples R China
[2] Xinjiang Univ Finance & Econ, Coll Comp, Urumqi, Xinjiang, Peoples R China
Source
JOURNAL OF ENGINEERING RESEARCH | 2023, Vol. 11, No. 02
Keywords
Simultaneous translation; Real-time; Surrounding scenes; Multi-modal; Image information;
DOI
10.1016/j.jer.2023.100085
Chinese Library Classification number
T [Industrial Technology];
Subject classification code
08;
Abstract
Simultaneous translation means translating a sentence before the speaker has finished it, so that the speaker's intention can be understood in real time. At present, simultaneous machine translation still relies on text-to-text data resources. In a pure-text translation system, the decoder's only input data resource is the output of the encoder. This information is derived solely from the text content, so the input is single-modal; the decoder is left short of decoding information, and vocabulary is missed in translation. A human translator also visually captures information from the surrounding scene to assist the translation work. Based on this observation, we propose a multi-modal simultaneous machine translation method that fuses image information. We extract information from the image and add it to the decoder side of the translation system, which increases the decoder's input data resources and helps the system improve translation quality. We use the Multi30K dataset for experimental verification. Compared with a plain-text translation system, the proposed method translates more complete sentences with richer content and yields better translation results.
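The abstract states only that image information is added on the decoder side; it does not say how. The following is a minimal sketch of one possible reading, not the authors' method: the image is assumed to be already encoded as a single vector with the same dimension as the text encoder states, and it is appended to the attention memory that the decoder sees after reading a prefix of the source sentence. All names (`attend`, `decoder_memory`, `num_read`) and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, memory):
    """Scaled dot-product attention of one query over a list of memory vectors."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, m) / scale for m in memory])
    context = [
        sum(w * m[i] for w, m in zip(weights, memory))
        for i in range(len(query))
    ]
    return context, weights


def decoder_memory(text_states, image_feature, num_read):
    """Memory visible to the decoder after num_read source tokens.

    Assumption: fusion is done by appending the image vector to the
    text-prefix states; the source abstract does not specify this.
    """
    return text_states[:num_read] + [image_feature]


# Toy example: three source-token states, only two read so far.
text_states = [[1.0, 0.0, 0.0, 0.0],
               [0.0, 1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0]]
image_feature = [0.0, 0.0, 0.0, 1.0]  # assumed pre-projected image vector
query = [0.0, 0.0, 0.0, 2.0]          # hypothetical decoder state

memory = decoder_memory(text_states, image_feature, num_read=2)
context, weights = attend(query, memory)
```

In this toy setting the unread third token contributes nothing to `context`, while the image vector is available from the first decoding step, which is the intuition behind giving the decoder an extra information source when the text prefix is still short.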
Pages: 7
Related papers
50 in total
  • [21] Fusion of auxiliary information for multi-modal biometrics authentication
    Toh, KA
    Yau, WY
    Lim, E
    Chen, L
    Ng, CH
    BIOMETRIC AUTHENTICATION, PROCEEDINGS, 2004, 3072 : 678 - 685
  • [22] A Multi-Modal Incompleteness Ontology model (MMIO) to enhance information fusion for image retrieval
    Poslad, Stefan
    Kesorn, Kraisak
    INFORMATION FUSION, 2014, 20 : 225 - 241
  • [23] MULTI-MODAL INFORMATION FUSION FOR CLASSIFICATION OF KIDNEY ABNORMALITIES
    Varsha, S.
    Nasser, Sahar Almahfouz
    Bala, Gouranga
    Kurian, Nikhil Cherian
    Sethi, Amit
    2022 IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING CHALLENGES (IEEE ISBI 2022), 2022,
  • [24] Multi-Modal Information Fusion for Localization of Emergency Vehicles
    Joshi, Aruna Kumar
    Kulkarni, Shrinivasrao B.
    INTERNATIONAL JOURNAL OF IMAGE AND GRAPHICS, 2024,
  • [25] Self-supervised multi-modal fusion network for multi-modal thyroid ultrasound image diagnosis
    Xiang, Zhuo
    Zhuo, Qiuluan
    Zhao, Cheng
    Deng, Xiaofei
    Zhu, Ting
    Wang, Tianfu
    Jiang, Wei
    Lei, Baiying
    COMPUTERS IN BIOLOGY AND MEDICINE, 2022, 150
  • [26] Multi-Modal Approaches for Post-Editing Machine Translation
    Herbig, Nico
    Pal, Santanu
    van Genabith, Josef
    Krueger, Antonio
    CHI 2019: PROCEEDINGS OF THE 2019 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, 2019,
  • [27] Multi-modal neural machine translation with deep semantic interactions
    Su, Jinsong
    Chen, Jinchang
    Jiang, Hui
    Zhou, Chulun
    Lin, Huan
    Ge, Yubin
    Wu, Qingqiang
    Lai, Yongxuan
    INFORMATION SCIENCES, 2021, 554 : 47 - 60
  • [28] Visual Agreement Regularized Training for Multi-Modal Machine Translation
    Yang, Pengcheng
    Chen, Boxing
    Zhang, Pei
    Sun, Xu
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 9418 - 9425
  • [29] Multi-modal graph contrastive encoding for neural machine translation
    Yin, Yongjing
    Zeng, Jiali
    Su, Jinsong
    Zhou, Chulun
    Meng, Fandong
    Zhou, Jie
    Huang, Degen
    Luo, Jiebo
    ARTIFICIAL INTELLIGENCE, 2023, 323
  • [30] Image information system: Towards multi-modal imaging
    Pagonis, D
    Cinquin, P
    BULLETIN DU CANCER, 1995, 82 : S520 - S529