Multi-modal simultaneous machine translation fusion of image information

Cited by: 1

Authors:

Huang, Yan [1]

Wanga, Zhanyang [1]

Zhang, TianYuan [1]

Xu, Chun [2]

Lianga, Hui [1]

Affiliations:

[1] Zhengzhou Univ Light Ind, Coll Software Engn, Zhengzhou, Henan, Peoples R China

[2] Xinjiang Univ Finance & Econ, Coll Comp, Urumqi, Xinjiang, Peoples R China

Source:

JOURNAL OF ENGINEERING RESEARCH | 2023, Vol. 11, No. 02

Keywords:

Simultaneous translation; Real-time; Surrounding scenes; Multi-modal; Image information;

DOI:

10.1016/j.jer.2023.100085

Chinese Library Classification (CLC) number:

T [Industrial Technology];

Discipline classification code:

08;

Abstract:

Simultaneous translation means translating a sentence before the speaker has finished it, so that the speaker's intention can be understood in real time. At present, simultaneous machine translation still relies on text-to-text data resources. In a pure-text translation system, the decoder takes only the encoder's output as its input data resource. Because this information is derived solely from the text, the input comes from a single source, which leaves the decoder short of information and causes words to be missed in translation. A human interpreter, by contrast, also visually captures information from the surrounding scene to assist the translation work. Based on this observation, we propose a multi-modal simultaneous machine translation method that fuses image information. We extract information from the image and add it to the decoder side of the translation system, which increases the decoder's input data resources and helps the system improve translation quality. We verify the method experimentally on the Multi30K dataset. Compared with a plain-text translation system, the proposed method produces more complete sentences, richer content, and better translation results.
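The abstract does not say how the image information is combined with the text on the decoder side. As a rough illustration only, the sketch below assumes one common choice: the decoder attends jointly over the source-prefix states read so far and a set of image feature vectors. The function name `fused_context`, the vector shapes, and the scaled dot-product scoring are all assumptions for illustration, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fused_context(query, text_states, image_feats):
    """Hypothetical fusion step (an assumption, not the paper's method).

    query       -- current decoder state, a list of floats
    text_states -- encoder states for the source words read so far
    image_feats -- image feature vectors with the same dimension as query

    Returns the attention-weighted context vector and the weights.
    """
    # The decoder's memory is the text prefix plus the image features,
    # so the image can supply information the partial sentence lacks.
    memory = list(text_states) + list(image_feats)
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, m) / scale for m in memory])
    context = [
        sum(w * m[i] for w, m in zip(weights, memory))
        for i in range(len(query))
    ]
    return context, weights


if __name__ == "__main__":
    # Only one source word has been read; two image vectors are available.
    ctx, w = fused_context(
        query=[1.0, 0.0],
        text_states=[[1.0, 0.0]],
        image_feats=[[0.0, 1.0], [1.0, 1.0]],
    )
    print("weights:", w)
    print("context:", ctx)
```

In this toy setup the image features are available from the first decoding step, while the text memory grows as more source words arrive, which mirrors the abstract's point that the image enlarges the decoder's input when the sentence is still incomplete.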

Pages: 7