Review of unlabeled image-text cross-modal retrieval based on real-valued features

Cited by: 0

Authors:

Zhang, Li [1]

Chen, Kang [1]

Sun, Guanghui [2]

Affiliations:

[1] Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China

[2] School of Astronautics, Harbin Institute of Technology, Harbin 150001, China

Source:

Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology | 2024, Vol. 56, No. 9

Keywords:

Feature extraction

DOI:

10.11918/202404027

Abstract:

To investigate the current state of development and the key open issues in cross-modal retrieval based on real-valued features for unlabeled datasets (hereinafter, cross-modal retrieval), this paper analyzes and summarizes the existing literature. Cross-modal retrieval is the task of retrieving samples in one modality that are relevant to a query given in another modality. First, existing cross-modal retrieval methods are classified by time complexity into feature-based methods and score-based methods. Second, the research status of both categories is described, and the main problems each currently faces are analyzed and discussed. Third, two mainstream datasets and the commonly used evaluation metrics for cross-modal retrieval are introduced, and the performance of the two categories on public datasets is compared and analyzed. Finally, the key issues that remain to be addressed in the field are summarized. The review indicates that although existing cross-modal retrieval methods have made significant progress, several key issues still urgently need to be addressed, and these represent important directions for future work in the field. © 2024 Harbin Institute of Technology. All rights reserved.
