Review of unlabeled image-text cross-modal retrieval based on real-valued features

Cited by: 0

Authors:

Zhang, Li [1]

Chen, Kang [1]

Sun, Guanghui [2]

Affiliations:

[1] Faculty of Computing, Harbin Institute of Technology, Harbin, 150001, China

[2] School of Astronautics, Harbin Institute of Technology, Harbin, 150001, China

Source:

Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology | 2024 / Vol. 56 / No. 09

Keywords:

Feature extraction

DOI:

10.11918/202404027

Abstract:

To investigate the current state of development and the key open issues in cross-modal retrieval based on real-valued features for unlabeled datasets (hereinafter, cross-modal retrieval), this paper analyzes and summarizes the existing literature. Cross-modal retrieval is the task of retrieving samples in one modality that are relevant to a query given in another modality. First, using a classification based on time complexity, existing cross-modal retrieval methods are divided into feature-based methods and score-based methods. Second, the research status of each category is described, and its main current issues are analyzed and discussed. Third, two mainstream datasets and the commonly used evaluation metrics for cross-modal retrieval are introduced, and the performance of the two categories of methods on public datasets is compared and analyzed. Finally, the key issues that remain to be addressed in cross-modal retrieval are summarized. The review indicates that, although existing cross-modal retrieval methods have made significant progress, key issues still urgently need to be addressed; these issues represent important directions for future work in the field. © 2024 Harbin Institute of Technology. All rights reserved.
