Text-image matching for multi-model machine translation

Cited by: 3

Authors:

Shi, Xiayang [1]

Yu, Zhenqiang [2]

Wang, Xuhui [3]

Li, Yijun [3]

Niu, Yufeng [3]

Affiliations:

[1] Zhengzhou Univ Light Ind, Coll Software Engn, Dongfeng Rd, Zhengzhou 450003, Peoples R China

[2] Zhengzhou Univ Light Ind, Coll Math & Informat Sci, Dongfeng Rd, Zhengzhou 450003, Peoples R China

[3] Inst Stand Measurement ShanXi Prov, Inspection & Testing Ctr ShanXi Prov, Changzhi Rd, Taiyuan 030000, Peoples R China

Source:

JOURNAL OF SUPERCOMPUTING | 2023 / Volume 79 / Issue 16

Keywords:

Multi-modal; Text-Image Matching; Similarity; Machine translation;

DOI:

10.1007/s11227-023-05318-9

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Multi-modal machine translation (MMT) aims to use information from other modalities to assist text machine translation and obtain higher-quality translations. Many studies have shown that image information can improve the quality of text machine translation. However, the multi-modal corpora used in translation require extensive manual annotation, which makes them difficult to label, and the resulting scarcity of datasets hinders work on multi-modal machine translation to some extent. To address the problem of text-image annotation, we propose a text-image similarity matching method. This method encodes the text and the image, maps them into a vector space, and uses cosine similarity to select the image most similar to the text, thereby constructing a multi-modal dataset. We conducted experiments on the Multi30K English-German text-only corpus and the WMT21 English-Hindi text-only corpus. The experimental results show that our method improves on the text-only translation results by 8.4 BLEU on the Multi30K corpus. Compared with manually annotated multi-modal datasets, our method improves by 4.2 BLEU. It also yields an improvement of 3.4 BLEU on the low-resource English-Hindi corpus. Our method can therefore effectively support the construction of multi-modal machine translation datasets and, to some extent, advance multi-modal machine translation research.
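
The matching step described in the abstract (encode text and image, map both into a vector space, pick the image with the highest cosine similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the encoders are not specified in this record, so the vectors below are hand-written toy embeddings, and the function names `cosine_similarity` and `best_image_for_text` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Assumption: a zero vector is treated as having no similarity to anything.
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_image_for_text(text_vec, image_vecs):
    """Return (index, score) of the image vector most similar to text_vec."""
    if not image_vecs:
        raise ValueError("image_vecs must contain at least one vector")
    scores = [cosine_similarity(text_vec, img) for img in image_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy embeddings standing in for encoder outputs (assumed, not from the paper).
text_vec = [1.0, 0.0, 1.0]
image_vecs = [
    [0.0, 1.0, 0.0],  # orthogonal to the text vector
    [2.0, 0.0, 2.0],  # same direction as the text vector
    [1.0, 1.0, 0.0],  # partially aligned
]

idx, score = best_image_for_text(text_vec, image_vecs)
print(idx, round(score, 6))
```

In this toy setup the second image is selected because its vector points in the same direction as the text vector; cosine similarity ignores magnitude, so the differing lengths do not matter. Pairing each sentence of a text-only corpus with its best-scoring image in this way would give one sentence-image pair per sentence.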

Pages: 17810 - 17823

Page count: 14

50 items in total

[41] A Multi-model Biometric Image Acquisition System
Zhang, Haoxiang
BIOMETRIC RECOGNITION, CCBR 2015, 2015, 9428 : 516 - 525
[42] A Multi-Stage Deep Learning Approach Incorporating Text-Image and Image-Image Comparisons for Cheapfake Detection
Seo, Jangwon
Hwang, Hyo-Seok
Lee, Jiyoung
Lee, Minhyeok
Kim, Wonsuk
Seok, Junhee
PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 1312 - 1316
[43] Image to Text Translation by Multi-Label Classification
Nasierding, Gulisong
Kouzani, Abbas Z.
ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS: WITH ASPECTS OF ARTIFICIAL INTELLIGENCE, 2010, 6216 : 247 - +
[44] THE TEXT-IMAGE RELATIONSHIP IN VERBETES OF AN ENGLISH LANGUAGE DICTIONARY
de Lima, Edmar Peixoto
Araujo, Edna M. Vasconcelos M.
Pontes, Antonio Luciano
DIALOGO DAS LETRAS, 2016, 5 (02): : 51 - 67
[45] Text-image Alignment for Diffusion-based Perception
Kondapaneni, Neehar
Marks, Markus
Knott, Manuel
Guimaraes, Rogerio
Perona, Pietro
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 13883 - 13893
[46] RELEVANCE AND MEANING OF TEXT-IMAGE INTERRELATION IN THE DECIMONONIC LITERATURE
Baquero Escudero, Ana L.
MONTEAGUDO, 2012, (17): : 183 - 188
[47] Text-Image Theory: A New Approach to Literary Semiotics
Yuping, Li
FORUM FOR WORLD LITERATURE STUDIES, 2022, 14 (02): : 357 - 365
[48] Experiences in evaluating multilingual and text-image information retrieval
Garcia-Serrano, Ana M.
Martinez-Fernandez, Jose L.
Martinez, Paloma
INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2006, 21 (07) : 655 - 677
[49] A Learning to Rank framework applied to text-image retrieval
David Buffoni
Sabrina Tollari
Patrick Gallinari
Multimedia Tools and Applications, 2012, 60 : 161 - 180
[50] A Learning to Rank framework applied to text-image retrieval
Buffoni, David
Tollari, Sabrina
Gallinari, Patrick
MULTIMEDIA TOOLS AND APPLICATIONS, 2012, 60 (01) : 161 - 180
