Weakly Supervised Hashing with Reconstructive Cross-modal Attention

Cited by: 1
Authors
Du, Yongchao [1 ]
Wang, Min [2 ]
Lu, Zhenbo [2 ]
Zhou, Wengang [1 ]
Li, Houqiang [1 ]
Affiliations
[1] Univ Sci & Technol China, Hefei 230027, Peoples R China
[2] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Weakly supervised hashing; attention; quantization
DOI
10.1145/3589185
CLC Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
On many popular social websites, images are usually associated with meta-data such as textual tags, which carry semantic information relevant to the image and can be used to supervise representation learning for image retrieval. However, these user-provided tags are often polluted by noise, so the main challenge lies in mining the potentially useful information from noisy tags. Many previous works simply treat different tags equally when generating supervision, which inevitably distracts network learning. To this end, we propose a new framework, termed Weakly Supervised Hashing with Reconstructive Cross-modal Attention (WSHRCA), to learn compact visual-semantic representations with more reliable supervision for the retrieval task. Specifically, for each image-tag pair, the weak supervision from tags is refined by cross-modal attention, which takes the image feature as the query to aggregate the most content-relevant tags. Tags with relevant content thus become more prominent while noisy tags are suppressed, providing more accurate supervisory information. To improve the effectiveness of hash learning, the image embedding in WSHRCA is reconstructed from the hash code, which is further optimized by a cross-modal constraint and explicitly improves hash learning. Experiments on two widely used datasets demonstrate the effectiveness of our proposed method for weakly supervised image retrieval. The code is available at https://github.com/duyc168/weakly-supervised-hashing.
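The tag-refinement step the abstract describes can be sketched as a standard scaled dot-product attention in which the image feature is the query and the tag embeddings are the keys and values. This is a minimal illustrative sketch in numpy; the function name, scaling, and toy vectors are assumptions for exposition, not the paper's exact formulation.

```python
import numpy as np

def refine_tags(image_feat, tag_embs):
    """Aggregate tag embeddings with the image feature as the attention query.

    image_feat: (d,) image feature vector (query).
    tag_embs:   (n, d) embeddings of the n user-provided tags (keys/values).
    Returns the attention-weighted tag representation and the weights.
    """
    d = image_feat.shape[0]
    scores = tag_embs @ image_feat / np.sqrt(d)   # query-key similarity
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                      # softmax over the tags
    refined = weights @ tag_embs                  # weighted aggregation
    return refined, weights

# Toy example: one content-relevant tag and two unrelated (noisy) tags.
image_feat = np.array([1.0, 0.0, 0.0, 0.0])
tag_embs = np.array([
    [0.9, 0.1, 0.0, 0.0],   # relevant tag, close to the image feature
    [0.0, 1.0, 0.0, 0.0],   # noisy tag
    [0.0, 0.0, 1.0, 0.0],   # noisy tag
])
refined, weights = refine_tags(image_feat, tag_embs)
```

In this toy setup the relevant tag receives the largest attention weight, so the refined representation is dominated by content-relevant tags — the suppression effect the abstract attributes to the cross-modal attention module.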
Pages: 19
Related Papers (50 total)
  • [11] Supervised Hierarchical Online Hashing for Cross-modal Retrieval
    Han, Kai
    Liu, Yu
    Wei, Rukai
    Zhou, Ke
    Xu, Jinhui
    Long, Kun
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (04)
  • [12] Correlation Autoencoder Hashing for Supervised Cross-Modal Search
    Cao, Yue
    Long, Mingsheng
    Wang, Jianmin
    Zhu, Han
    ICMR'16: PROCEEDINGS OF THE 2016 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2016, : 197 - 204
  • [13] Supervised Contrastive Discrete Hashing for cross-modal retrieval
    Li, Ze
    Yao, Tao
    Wang, Lili
    Li, Ying
    Wang, Gang
    KNOWLEDGE-BASED SYSTEMS, 2024, 295
  • [14] Discriminative Supervised Hashing for Cross-Modal Similarity Search
    Yu, Jun
    Wu, Xiao-Jun
    Kittler, Josef
    IMAGE AND VISION COMPUTING, 2019, 89 : 50 - 56
  • [15] Dual-pathway Attention based Supervised Adversarial Hashing for Cross-modal Retrieval
    Wang, Xiaoxiao
    Liang, Meiyu
    Cao, Xiaowen
    Du, Junping
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP 2021), 2021, : 168 - 171
  • [16] Supervised Matrix Factorization Hashing for Cross-Modal Retrieval
    Tang, Jun
    Wang, Ke
    Shao, Ling
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (07) : 3157 - 3166
  • [17] FUSION-SUPERVISED DEEP CROSS-MODAL HASHING
    Wang, Li
    Zhu, Lei
    Yu, En
    Sun, Jiande
    Zhang, Huaxiang
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 37 - 42
  • [18] Discriminative correlation hashing for supervised cross-modal retrieval
    Lu, Xu
    Zhang, Huaxiang
    Sun, Jiande
    Wang, Zhenhua
    Guo, Peilian
    Wan, Wenbo
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2018, 65 : 221 - 230
  • [19] Supervised Hierarchical Deep Hashing for Cross-Modal Retrieval
    Zhan, Yu-Wei
    Luo, Xin
    Wang, Yongxin
    Xu, Xin-Shun
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 3386 - 3394
  • [20] Discrete Robust Supervised Hashing for Cross-Modal Retrieval
    Yao, Tao
    Zhang, Zhiwang
    Yan, Lianshan
    Yue, Jun
    Tian, Qi
    IEEE ACCESS, 2019, 7 : 39806 - 39814