Multi-modal simultaneous machine translation fusion of image information

Cited by: 1

Authors:

Huang, Yan [1]

Wanga, Zhanyang [1]

Zhang, TianYuan [1]

Xu, Chun [2]

Lianga, Hui [1]

Affiliations:

[1] Zhengzhou Univ Light Ind, Coll Software Engn, Zhengzhou, Henan, Peoples R China

[2] Xinjiang Univ Finance & Econ, Coll Comp, Urumqi, Xinjiang, Peoples R China

Source:

JOURNAL OF ENGINEERING RESEARCH | 2023 / Vol. 11 / Issue 02

Keywords:

Simultaneous translation; Real-time; Surrounding scenes; Multi-modal; Image information

DOI:

10.1016/j.jer.2023.100085

Chinese Library Classification (CLC) number:

T [Industrial Technology]

Discipline classification code:

08

Abstract:

Simultaneous translation means translating a sentence before the speaker has finished it, so that the speaker's intention can be understood in real time. At present, simultaneous machine translation still relies on text-to-text data resources. In a pure-text translation system, the decoder takes the encoder's output as its input data resource. This information is derived only from the text content, so the input comes from a single source, which leaves the decoder short of decoding information and causes words to be missed in translation. A human translator also visually captures information from the surrounding scene to assist the translation work. Based on this observation, we propose a multi-modal simultaneous machine translation method that fuses image information. We extract information from the image and add it to the decoder side of the translation system, which enlarges the decoder's input data resource and helps the system improve translation quality. We use the Multi30K dataset for experimental verification. Compared with a plain-text translation system, the proposed method translates more complete sentences with richer content and achieves better translation results.


Pages: 7
