Text-image matching for multi-model machine translation

Cited by: 3
Authors
Shi, Xiayang [1]
Yu, Zhenqiang [2]
Wang, Xuhui [3]
Li, Yijun [3]
Niu, Yufeng [3]
Affiliations
[1] Zhengzhou Univ Light Ind, Coll Software Engn, Dongfeng Rd, Zhengzhou 450003, Peoples R China
[2] Zhengzhou Univ Light Ind, Coll Math & Informat Sci, Dongfeng Rd, Zhengzhou 450003, Peoples R China
[3] Inst Stand Measurement ShanXi Prov, Inspection & Testing Ctr ShanXi Prov, Changzhi Rd, Taiyuan 030000, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2023 / Vol. 79 / Issue 16
Keywords
Multi-modal; Text-Image Matching; Similarity; Machine translation
DOI
10.1007/s11227-023-05318-9
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Multi-modal machine translation (MMT) aims to use information from other modalities to assist text machine translation and obtain higher-quality translations. Many studies have shown that image information can improve the quality of text machine translation. However, the multi-modal corpora used in translation require extensive manual annotation, which makes them costly to label, and the resulting scarcity of datasets holds back work on multi-modal machine translation to some extent. To address the text-image annotation problem, we propose a text-image similarity matching method. The method encodes the text and the images, maps them into a vector space, and uses cosine similarity to select the image most similar to the text, from which a multi-modal dataset is constructed. We conducted experiments on the Multi30K English-German text-only corpus and the WMT21 English-Hindi text-only corpus. On the Multi30K corpus, our method improves on the text-only translation results by 8.4 BLEU, and it improves on manually annotated multi-modal datasets by 4.2 BLEU. On the low-resource English-Hindi corpus, it yields an improvement of 3.4 BLEU. Our method can therefore effectively support the construction of multi-modal machine translation datasets and, to some extent, advance multi-modal machine translation research.
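The matching step described in the abstract (cosine similarity between a text vector and candidate image vectors, keeping the closest image) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the text or image encoders, so the embeddings below are hypothetical toy vectors, and the helper names `cosine_similarity` and `best_matching_image` are introduced here purely for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: treat a zero vector as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def best_matching_image(text_vec, image_vecs):
    """Return (index, score) of the image vector most similar to text_vec."""
    if not image_vecs:
        raise ValueError("image_vecs must not be empty")
    scores = [cosine_similarity(text_vec, v) for v in image_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical embeddings; in practice these would come from text and
# image encoders that map both modalities into a shared vector space.
text_vec = [1.0, 0.0, 1.0]
image_vecs = [
    [0.0, 1.0, 0.0],  # orthogonal to the text vector
    [2.0, 0.0, 2.0],  # same direction as the text vector
    [1.0, 1.0, 0.0],  # partially aligned
]

index, score = best_matching_image(text_vec, image_vecs)
print(index, round(score, 6))
```

In this toy case the second image (index 1) is selected because its vector points in the same direction as the text vector. Pairing each sentence with its top-scoring image in this way is how a text-only corpus could be turned into text-image pairs without manual annotation.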
Pages: 17810-17823
Page count: 14