Review of Recent Deep Learning Based Methods for Image-Text Retrieval

Cited by: 5

Authors:

Chen, Jianan [1]

Zhang, Lu [1]

Bai, Cong [2]

Kpalma, Kidiyo [1]

Affiliations:

[1] Univ Rennes, INSA Rennes, CNRS, UMR 6164, IETR, F-35000 Rennes, France

[2] Zhejiang Univ Technol, Coll Comp Sci, Hangzhou, Peoples R China

Source:

THIRD INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2020) | 2020

DOI:

10.1109/MIPR49039.2020.00042

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

Cross-modal retrieval has drawn much attention in recent years because the diversity and quantity of data have exploded with the popularity of mobile devices and social media. Efficiently extracting relevant information from large-scale multi-modal data is becoming a crucial problem in information retrieval. Cross-modal retrieval aims to retrieve relevant information across different modalities. In this paper, we highlight key points of recent deep learning-based cross-modal retrieval approaches, especially in the image-text retrieval context, and classify them into four categories according to their embedding methods. Evaluations of state-of-the-art cross-modal retrieval methods on two benchmark datasets are presented at the end of the paper.

Pages: 171 - 176

Page count: 6

50 items in total

[31] Integrating listwise ranking into pairwise-based image-text retrieval
Li, Zheng
Guo, Caili
Wang, Xin
Zhang, Hao
Wang, Yanjun
KNOWLEDGE-BASED SYSTEMS, 2024, 287
[32] Multiscale Salient Alignment Learning for Remote-Sensing Image-Text Retrieval
Chen, Yaxiong
Huang, Jinghao
Li, Xiaoyu
Xiong, Shengwu
Lu, Xiaoqiang
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 13
[33] Learning and Integrating Multi-Level Matching Features for Image-Text Retrieval
Lan, Hong
Zhang, Pufen
IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 374 - 378
[34] An image-text multimodal fusion of deep learning for detecting insulator defects
Feng, Bo
Xia, Xiaofei
Zhang, Longfei
Zhang, Wei
Xu, Qi
Liu, Peng
Chen, Shaonan
INTERNATIONAL JOURNAL OF PARALLEL EMERGENT AND DISTRIBUTED SYSTEMS, 2025,
[35] Deep Cross-Modal Projection Learning for Image-Text Matching
Zhang, Ying
Lu, Huchuan
COMPUTER VISION - ECCV 2018, PT I, 2018, 11205 : 707 - 723
[36] DEEP RANK CROSS-MODAL HASHING WITH SEMANTIC CONSISTENT FOR IMAGE-TEXT RETRIEVAL
Liu, Xiaoqing
Zeng, Huanqiang
Shi, Yifan
Zhu, Jianqing
Ma, Kai-Kuang
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2022, 2022-May : 4828 - 4832
[37] Continual learning for cross-modal image-text retrieval based on domain-selective attention
Yang, Rui
Wang, Shuang
Gu, Yu
Wang, Jihui
Sun, Yingzhi
Zhang, Huan
Liao, Yu
Jiao, Licheng
PATTERN RECOGNITION, 2024, 149
[38] Dynamic Modality Interaction Modeling for Image-Text Retrieval
Qu, Leigang
Liu, Meng
Wu, Jianlong
Gao, Zan
Nie, Liqiang
SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021: 1104 - 1113
[39] DEEP RANK CROSS-MODAL HASHING WITH SEMANTIC CONSISTENT FOR IMAGE-TEXT RETRIEVAL
Liu, Xiaoqing
Zeng, Huanqiang
Shi, Yifan
Zhu, Jianqing
Ma, Kai-Kuang
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022: 4828 - 4832
[40] Review of unlabeled image-text cross-modal retrieval based on real-valued features
Zhang, Li
Chen, Kang
Sun, Guanghui
Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology, 2024, 56 (09): 1 - 16
