Review of unlabeled image-text cross-modal retrieval based on real-valued features

Cited by: 0

Authors:

Zhang, Li [1]

Chen, Kang [1]

Sun, Guanghui [2]

Affiliations:

[1] Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China

[2] School of Astronautics, Harbin Institute of Technology, Harbin 150001, China

Source:

Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology | 2024, Vol. 56, No. 09

Keywords:

Feature extraction

DOI:

10.11918/202404027

Abstract:

To examine the current state of development and the key open issues in cross-modal retrieval based on real-valued features for unlabeled datasets (hereinafter, cross-modal retrieval), this paper analyzes and summarizes the existing literature. Cross-modal retrieval is the task of retrieving samples in one modality that are relevant to a query given in another modality. First, existing cross-modal retrieval methods are classified by time complexity into feature-based methods and score-based methods. Second, the research status of each category is described, and the main problems each currently faces are analyzed and discussed. Third, two mainstream datasets and the commonly used evaluation metrics for cross-modal retrieval are introduced, and the performance of the two categories of methods on public datasets is compared and analyzed. Finally, the key issues that remain to be addressed in cross-modal retrieval are summarized. The review indicates that, although existing cross-modal retrieval methods have made significant progress, key issues still urgently need to be addressed, and these issues represent important directions for future work in the field. © 2024 Harbin Institute of Technology. All rights reserved.
