Review of Recent Deep Learning Based Methods for Image-Text Retrieval

Cited by: 5
Authors
Chen, Jianan [1]
Zhang, Lu [1]
Bai, Cong [2]
Kpalma, Kidiyo [1]
Affiliations
[1] Univ Rennes, INSA Rennes, CNRS, UMR 6164, IETR, F-35000 Rennes, France
[2] Zhejiang Univ Technol, Coll Comp Sci, Hangzhou, Peoples R China
DOI
10.1109/MIPR49039.2020.00042
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Cross-modal retrieval has drawn much attention in recent years owing to the diversity and volume of data, which have grown explosively with the popularity of mobile devices and social media. Efficiently extracting relevant information from large-scale multi-modal data is becoming a crucial problem in information retrieval. Cross-modal retrieval aims to retrieve relevant information across different modalities. In this paper, we highlight the key points of recent deep-learning-based cross-modal retrieval approaches, especially in the image-text retrieval context, and classify them into four categories according to their embedding methods. Evaluations of state-of-the-art cross-modal retrieval methods on two benchmark datasets are presented at the end of the paper.
Pages: 171-176
Page count: 6
Related papers
50 in total
  • [31] Integrating listwise ranking into pairwise-based image-text retrieval
    Li, Zheng
    Guo, Caili
    Wang, Xin
    Zhang, Hao
    Wang, Yanjun
    KNOWLEDGE-BASED SYSTEMS, 2024, 287
  • [32] Multiscale Salient Alignment Learning for Remote-Sensing Image-Text Retrieval
    Chen, Yaxiong
    Huang, Jinghao
    Li, Xiaoyu
    Xiong, Shengwu
    Lu, Xiaoqiang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 13
  • [33] Learning and Integrating Multi-Level Matching Features for Image-Text Retrieval
    Lan, Hong
    Zhang, Pufen
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 374 - 378
  • [34] An image-text multimodal fusion of deep learning for detecting insulator defects
    Feng, Bo
    Xia, Xiaofei
    Zhang, Longfei
    Zhang, Wei
    Xu, Qi
    Liu, Peng
    Chen, Shaonan
    INTERNATIONAL JOURNAL OF PARALLEL EMERGENT AND DISTRIBUTED SYSTEMS, 2025,
  • [35] Deep Cross-Modal Projection Learning for Image-Text Matching
    Zhang, Ying
    Lu, Huchuan
    COMPUTER VISION - ECCV 2018, PT I, 2018, 11205 : 707 - 723
  • [36] DEEP RANK CROSS-MODAL HASHING WITH SEMANTIC CONSISTENT FOR IMAGE-TEXT RETRIEVAL
    Liu, Xiaoqing
    Zeng, Huanqiang
    Shi, Yifan
    Zhu, Jianqing
    Ma, Kai-Kuang
    ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2022, 2022-May : 4828 - 4832
  • [37] Continual learning for cross-modal image-text retrieval based on domain-selective attention
    Yang, Rui
    Wang, Shuang
    Gu, Yu
    Wang, Jihui
    Sun, Yingzhi
    Zhang, Huan
    Liao, Yu
    Jiao, Licheng
    PATTERN RECOGNITION, 2024, 149
  • [38] Dynamic Modality Interaction Modeling for Image-Text Retrieval
    Qu, Leigang
    Liu, Meng
    Wu, Jianlong
    Gao, Zan
    Nie, Liqiang
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 1104 - 1113
  • [39] DEEP RANK CROSS-MODAL HASHING WITH SEMANTIC CONSISTENT FOR IMAGE-TEXT RETRIEVAL
    Liu, Xiaoqing
    Zeng, Huanqiang
    Shi, Yifan
    Zhu, Jianqing
    Ma, Kai-Kuang
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 4828 - 4832
  • [40] Review of unlabeled image-text cross-modal retrieval based on real-valued features
    Zhang, Li
    Chen, Kang
    Sun, Guanghui
    Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology, 2024, 56 (09): : 1 - 16