Review of Recent Deep Learning Based Methods for Image-Text Retrieval

Cited by: 5
Authors
Chen, Jianan [1]
Zhang, Lu [1]
Bai, Cong [2]
Kpalma, Kidiyo [1]
Affiliations
[1] Univ Rennes, INSA Rennes, CNRS, UMR 6164, IETR, F-35000 Rennes, France
[2] Zhejiang Univ Technol, Coll Comp Sci, Hangzhou, Peoples R China
DOI
10.1109/MIPR49039.2020.00042
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Cross-modal retrieval has drawn much attention in recent years, as the diversity and volume of data have exploded with the popularity of mobile devices and social media. Extracting relevant information efficiently from large-scale multi-modal data is becoming a crucial problem in information retrieval. Cross-modal retrieval aims to retrieve relevant information across different modalities. In this paper, we highlight key points of recent deep learning-based cross-modal retrieval approaches, especially in the image-text retrieval context, and classify them into four categories according to their embedding methods. Evaluations of state-of-the-art cross-modal retrieval methods on two benchmark datasets are presented at the end of the paper.
Pages: 171-176
Page count: 6
Related papers
50 in total
  • [1] Dissecting Deep Metric Learning Losses for Image-Text Retrieval
    Xuan, Hong
    Chen, Xi
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 2163 - 2172
  • [2] Compositional Learning of Image-Text Query for Image Retrieval
    Anwaar, Muhammad Umer
    Labintcev, Egor
    Kleinsteuber, Martin
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021, : 1139 - 1148
  • [3] Visual context learning based on textual knowledge for image-text retrieval
    Qin, Yuzhuo
    Gu, Xiaodong
    Tan, Zhenshan
    NEURAL NETWORKS, 2022, 152 : 434 - 449
  • [4] Joint Image-text Representation Learning for Fashion Retrieval
    Yan, Cairong
    Li, Yu
    Wan, Yongquan
    Zhang, Zhaohui
    ICMLC 2020: 2020 12TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING, 2018, : 412 - 417
  • [5] The Research on Image-text Fusion Emotion Recognition Based on Deep Learning
    Wang, Xuyang
    Zhang, Xin
    PROCEEDINGS OF 2024 INTERNATIONAL CONFERENCE ON COMPUTER AND MULTIMEDIA TECHNOLOGY, ICCMT 2024, 2024, : 124 - 128
  • [6] Image-text bidirectional learning network based cross-modal retrieval
    Li, Zhuoyi
    Lu, Huibin
    Fu, Hao
    Gu, Guanghua
    NEUROCOMPUTING, 2022, 483 : 148 - 159
  • [7] Dual Stream Relation Learning Network for Image-Text Retrieval
    Wu, Dongqing
    Li, Huihui
    Gu, Cang
    Guo, Lei
    Liu, Hang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 1551 - 1565
  • [8] Cross-modal Image-Text Retrieval with Multitask Learning
    Luo, Junyu
    Shen, Ying
    Ao, Xiang
    Zhao, Zhou
    Yang, Min
    PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM '19), 2019, : 2309 - 2312
  • [9] Learning to Embed Semantic Similarity for Joint Image-Text Retrieval
    Malali, Noam
    Keller, Yosi
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (12) : 10252 - 10260
  • [10] Multi-level similarity learning for image-text retrieval
    Li, Wen-Hui
    Yang, Song
    Wang, Yan
    Song, Dan
    Li, Xuan-Ya
    INFORMATION PROCESSING & MANAGEMENT, 2021, 58 (01)