Multi-modal simultaneous machine translation fusion of image information

Cited by: 1
Authors
Huang, Yan [1]
Wang, Zhanyang [1]
Zhang, TianYuan [1]
Xu, Chun [2]
Liang, Hui [1]
Affiliations
[1] Zhengzhou Univ Light Ind, Coll Software Engn, Zhengzhou, Henan, Peoples R China
[2] Xinjiang Univ Finance & Econ, Coll Comp, Urumqi, Xinjiang, Peoples R China
Source
JOURNAL OF ENGINEERING RESEARCH | 2023, Vol. 11, No. 2
Keywords
Simultaneous translation; Real-time; Surrounding scenes; Multi-modal; Image information
DOI
10.1016/j.jer.2023.100085
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Simultaneous translation means translating a sentence before the speaker has finished it, so that the speaker's intention can be understood in real time. At present, simultaneous machine translation still relies on text-to-text data resources. In a pure text translation system, the decoder takes the encoder's output as its input data resource. This information is derived only from the text content, so the decoder has a single source of input, which leaves it short of decoding information and causes words to be missed in translation. A human translator also visually captures information from the surrounding scene to assist the translation work. Based on this observation, we propose a multi-modal simultaneous machine translation method that fuses image information. We extract information from the image and add it to the decoder side of the translation system, which increases the decoder's input data resources and helps the system improve translation quality. We use the Multi30K dataset for experimental verification. Compared with a plain-text translation system, the proposed method produces more complete sentences, richer content, and better translation results.
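The abstract does not say how the image information is combined with the text on the decoder side. The following is only an illustrative sketch of one possible arrangement, assuming the decoder input is a simple concatenation of a context vector pooled over the source words read so far and a fixed image feature vector. The names `mean_pool`, `fuse_decoder_input`, and `read_so_far` are hypothetical and do not come from the paper.

```python
from typing import List

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a non-empty list of equal-length vectors, dimension by dimension."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def fuse_decoder_input(
    text_states: List[Vector],
    image_feature: Vector,
    read_so_far: int,
) -> Vector:
    """Build a decoder input from a partial source sentence plus an image vector.

    Assumption (not from the paper): fusion is plain concatenation. Only the
    first `read_so_far` encoder states are visible, which mimics simultaneous
    translation, where decoding starts before the sentence is complete.
    """
    if not 1 <= read_so_far <= len(text_states):
        raise ValueError("read_so_far must be between 1 and len(text_states)")
    prefix_context = mean_pool(text_states[:read_so_far])
    # Concatenation gives the decoder a second information source besides text.
    return prefix_context + image_feature


# Toy example: three encoder states of size 2 and an image vector of size 2.
text_states = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
image_feature = [0.5, 0.25]
fused = fuse_decoder_input(text_states, image_feature, read_so_far=2)
```

With only two source words read, the text part of `fused` reflects the prefix alone, while the image part is available in full from the start. That is the intuition behind supplying visual context to a decoder that must act on incomplete text.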
Pages: 7