Cross-modal image-text search via Efficient Discrete Class Alignment Hashing

Cited by: 15

Authors:

Wang, Song [1,2]

Zhao, Huan [1,2]

Wang, Yunbo [3]

Huang, Jing [1,2]

Li, Keqin [1,2,4]

Affiliations:

[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Peoples R China

[2] Key Lab Embedded & Network Comp Hunan Prov, Changsha 410082, Peoples R China

[3] Peking Univ, Wangxuan Inst Comp Technol, Beijing 100080, Peoples R China

[4] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA

Source:

INFORMATION PROCESSING & MANAGEMENT | 2022 / Vol. 59 / Issue 03

Funding:

National Natural Science Foundation of China;

Keywords:

Class alignment; Cross-modal image-text search; Hash code; Supervised hashing; BINARY-CODES;

DOI:

10.1016/j.ipm.2022.102886

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Hashing has shown great potential in cross-modal image-text search, where it learns compact binary codes by exploring the correlations between distinct modalities. However, some limitations remain. First, most existing methods neglect the relation between the data characteristics and the supervised information. Second, a relaxation strategy results in large quantization errors. Third, constructing large n x n similarity graphs (where n is the training size) increases the computational load. To address these issues, we propose a novel discrete supervised hashing method, termed Efficient Discrete Class Alignment Hashing (EDCAH), which integrates class alignment and matrix factorization for hash learning. Specifically, it exploits the semantic consistency of data instances and informative labels to simultaneously learn the hash codes and hash functions. Meanwhile, a discrete optimization strategy is developed to solve EDCAH, which helps generate high-quality hash codes. Furthermore, to improve the learning efficiency of EDCAH, we propose a fast and efficient variant, dubbed EDCAH-t, that uses a two-step hashing strategy. Extensive experiments demonstrate the superiority of EDCAH and EDCAH-t in both search accuracy and learning efficiency.
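
The abstract refers to hash functions that map each modality to compact binary codes, which are then compared at search time. The following is a minimal sketch of that retrieval step only, assuming sign-thresholded linear projections as the hash functions and Hamming distance for ranking. It does not reproduce EDCAH's learning procedure; the function names and the hand-picked projection matrices are hypothetical and stand in for whatever a trained model would provide.

```python
def hash_code(features, projection):
    """Map a real-valued feature vector to a binary code stored as an int.

    Bit k is set when the k-th linear projection of the features is >= 0.
    `projection` is a hypothetical, hand-picked matrix (one row per bit),
    not a learned EDCAH hash function.
    """
    code = 0
    for k, row in enumerate(projection):
        score = sum(w * x for w, x in zip(row, features))
        if score >= 0:
            code |= 1 << k
    return code


def hamming(a, b):
    """Number of differing bits between two integer-encoded codes."""
    return bin(a ^ b).count("1")


def search(query_code, database_codes, top_k=3):
    """Return indices of the top_k database codes closest in Hamming distance."""
    ranked = sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )
    return ranked[:top_k]


# Assumed toy setup: 2-d text features, 4-d image features, 3-bit codes.
text_projection = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
image_projection = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, -1.0],
]

images = [
    [-1.0, 2.0, 0.0, 1.0],
    [1.0, -1.0, 3.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
]
database_codes = [hash_code(x, image_projection) for x in images]

# A text query retrieves images by comparing codes across modalities.
query_code = hash_code([0.9, -0.2], text_projection)
nearest = search(query_code, database_codes, top_k=2)
```

Because both modalities share one code space, a text query can rank images with bitwise operations alone, which is why compact binary codes keep search cheap.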

Citations

Pages: 17

50 items in total

[31] Robust and Flexible Discrete Hashing for Cross-Modal Similarity Search
Wang, Di
Wang, Quan
Gao, Xinbo
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2018, 28 (10) : 2703 - 2715
[32] Cross-modal Semantic Interference Suppression for image-text matching
Yao, Tao
Peng, Shouyong
Sun, Yujuan
Sheng, Guorui
Fu, Haiyan
Kong, Xiangwei
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 133
[34] Cross-modal Graph Matching Network for Image-text Retrieval
Cheng, Yuhao
Zhu, Xiaoguang
Qian, Jiuchao
Wen, Fei
Liu, Peilin
ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2022, 18 (04)
[35] Joint feature approach for image-text cross-modal retrieval
Gao, Dihui
Sheng, Lijie
Xu, Xiaodong
Miao, Qiguang
Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2024, 51 (04) : 128 - 138
[36] Cross-modal independent matching network for image-text retrieval
Ke, Xiao
Chen, Baitao
Yang, Xiong
Cai, Yuhang
Liu, Hao
Guo, Wenzhong
PATTERN RECOGNITION, 2025, 159
[37] Image-Text Cross-Modal Retrieval with Instance Contrastive Embedding
Zeng, Ruigeng
Ma, Wentao
Wu, Xiaoqian
Liu, Wei
Liu, Jie
ELECTRONICS, 2024, 13 (02)
[38] Image-Text Cross-Modal Retrieval via Modality-Specific Feature Learning
Wang, Jian
He, Yonghao
Kang, Cuicui
Xiang, Shiming
Pan, Chunhong
ICMR'15: PROCEEDINGS OF THE 2015 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2015, : 347 - 354
[39] Text-based person search via cross-modal alignment learning
Ke, Xiao
Liu, Hao
Xu, Peirong
Lin, Xinru
Guo, Wenzhong
PATTERN RECOGNITION, 2024, 152
[40] Asymmetric Discrete Cross-Modal Hashing
Luo, Xin
Zhang, Peng-Fei
Wu, Ye
Chen, Zhen-Duo
Huang, Hua-Junjie
Xu, Xin-Shun
ICMR '18: PROCEEDINGS OF THE 2018 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2018, : 204 - 212
