Review of Recent Deep Learning Based Methods for Image-Text Retrieval

Cited by: 5

Authors:

Chen, Jianan [1]

Zhang, Lu [1]

Bai, Cong [2]

Kpalma, Kidiyo [1]

Affiliations:

[1] Univ Rennes, INSA Rennes, CNRS, UMR 6164, IETR, F-35000 Rennes, France

[2] Zhejiang Univ Technol, Coll Comp Sci, Hangzhou, Peoples R China

Source:

THIRD INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2020) | 2020

DOI:

10.1109/MIPR49039.2020.00042

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

Cross-modal retrieval has drawn much attention in recent years, as the diversity and volume of data have exploded with the popularity of mobile devices and social media. Efficiently extracting relevant information from large-scale multi-modal data is becoming a crucial problem in information retrieval. Cross-modal retrieval aims to retrieve relevant information across different modalities. In this paper, we highlight the key points of recent deep learning based cross-modal retrieval approaches, especially in the image-text retrieval context, and classify them into four categories according to their embedding methods. Evaluations of state-of-the-art cross-modal retrieval methods on two benchmark datasets are presented at the end of the paper.

Pages: 171-176

Page count: 6

50 items in total

[1] Dissecting Deep Metric Learning Losses for Image-Text Retrieval
Xuan, Hong
Chen, Xi
2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023: 2163 - 2172
[2] Compositional Learning of Image-Text Query for Image Retrieval
Anwaar, Muhammad Umer
Labintcev, Egor
Kleinsteuber, Martin
2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021: 1139 - 1148
[3] Visual context learning based on textual knowledge for image-text retrieval
Qin, Yuzhuo
Gu, Xiaodong
Tan, Zhenshan
NEURAL NETWORKS, 2022, 152 : 434 - 449
[4] Joint Image-text Representation Learning for Fashion Retrieval
Yan, Cairong
Li, Yu
Wan, Yongquan
Zhang, Zhaohui
ICMLC 2020: 2020 12TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING, 2018, : 412 - 417
[5] The Research on Image-text Fusion Emotion Recognition Based on Deep Learning
Wang, Xuyang
Zhang, Xin
PROCEEDINGS OF 2024 INTERNATIONAL CONFERENCE ON COMPUTER AND MULTIMEDIA TECHNOLOGY, ICCMT 2024, 2024: 124 - 128
[6] Image-text bidirectional learning network based cross-modal retrieval
Li, Zhuoyi
Lu, Huibin
Fu, Hao
Gu, Guanghua
NEUROCOMPUTING, 2022, 483 : 148 - 159
[7] Dual Stream Relation Learning Network for Image-Text Retrieval
Wu, Dongqing
Li, Huihui
Gu, Cang
Guo, Lei
Liu, Hang
IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 1551 - 1565
[8] Cross-modal Image-Text Retrieval with Multitask Learning
Luo, Junyu
Shen, Ying
Ao, Xiang
Zhao, Zhou
Yang, Min
PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM '19), 2019: 2309 - 2312
[9] Learning to Embed Semantic Similarity for Joint Image-Text Retrieval
Malali, Noam
Keller, Yosi
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (12) : 10252 - 10260
[10] Multi-level similarity learning for image-text retrieval
Li, Wen-Hui
Yang, Song
Wang, Yan
Song, Dan
Li, Xuan-Ya
INFORMATION PROCESSING & MANAGEMENT, 2021, 58 (01)
