Deep cross-modal hashing with multi-task latent space learning

Times Cited: 2
Authors
Wu, Song [1 ]
Yuan, Xiang [1 ]
Xiao, Guoqiang [1 ]
Lew, Michael S. [2 ]
Gao, Xinbo [3 ]
Affiliations
[1] Southwest Univ, Coll Comp & Informat Sci, Chongqing, Peoples R China
[2] Leiden Univ, Liacs Media Lab, Leiden, Netherlands
[3] Chongqing Univ Posts & Telecommun, Coll Comp Sci & Technol, Chongqing, Peoples R China
Keywords
Cross-modal retrieval; Deep hashing; Semantic dependency; Knowledge transfer; Binary codes; Representation
DOI
10.1016/j.engappai.2024.108944
Chinese Library Classification
TP [Automation Technology; Computer Technology]
Discipline Classification Code
0812
Abstract
Cross-modal Hashing (CMH) retrieval aims to mutually search data across heterogeneous modalities by projecting the original modality data into a common Hamming space, offering the significant advantages of low storage and computation costs. However, CMH remains challenging on multi-label cross-modal datasets. First, content similarity is inevitably preserved only imperfectly when represented by short binary codes. Second, different semantics are treated independently and their co-occurrences are neglected, reducing retrieval quality. Third, the commonly used metric-learning objective fails to capture similarity information at a fine-grained level, leading to imprecise preservation of such information. We therefore propose a Deep Cross-Modal Hashing with Multi-Task Latent Space Learning (DMLSH) framework to tackle these bottlenecks. To more thoroughly mine distinctive features with diverse characteristics from heterogeneous data, DMLSH is designed to preserve three types of knowledge: first, semantic relevance and co-occurrence, captured by integrating an attention module with a Long Short-Term Memory (LSTM) layer; second, highly precise pairwise correlation, obtained by quantifying semantic similarity with self-paced optimization; and last, pairwise similarity information discovered by a self-supervised semantic network from the perspective of probabilistic knowledge transfer. The abundant knowledge in these latent spaces is refined and fused into a common Hamming space by a hashing attention mechanism, which improves the discriminability of the hash codes and eliminates the heterogeneity between modalities. Extensive experiments demonstrate the state-of-the-art performance of the proposed DMLSH on four mainstream cross-modal retrieval benchmarks.
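To make the retrieval setting described above concrete, the following is a minimal sketch of a generic deep cross-modal hashing objective in PyTorch. It is not the authors' DMLSH: the feature dimensions, the HashHead module, and the pairwise_hash_loss function are illustrative assumptions, and the loss is a standard pairwise-likelihood term with a tanh relaxation and a quantization penalty rather than the paper's multi-task formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class HashHead(nn.Module):
    """Maps one modality's features to K-bit relaxed hash codes in (-1, 1)."""
    def __init__(self, in_dim, n_bits):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU(), nn.Linear(512, n_bits))

    def forward(self, x):
        return torch.tanh(self.net(x))   # tanh relaxation of the sign(.) binarization

def pairwise_hash_loss(b_img, b_txt, S, quant_weight=0.1):
    # Likelihood term: similar pairs (S = 1) should have large code inner products.
    theta = 0.5 * b_img @ b_txt.t()
    nll = (F.softplus(theta) - S * theta).mean()
    # Quantization term: push the relaxed codes toward {-1, +1}.
    quant = ((b_img - b_img.sign()) ** 2).mean() + ((b_txt - b_txt.sign()) ** 2).mean()
    return nll + quant_weight * quant

if __name__ == "__main__":
    img_feat = torch.randn(8, 4096)                 # assumed pre-extracted CNN image features
    txt_feat = torch.randn(8, 1386)                 # assumed bag-of-words text features
    labels = torch.randint(0, 2, (8, 24)).float()   # assumed multi-label annotations
    S = (labels @ labels.t() > 0).float()           # similar if they share at least one label
    img_head, txt_head = HashHead(4096, 64), HashHead(1386, 64)
    loss = pairwise_hash_loss(img_head(img_feat), txt_head(txt_feat), S)
    loss.backward()
    print(f"toy cross-modal hashing loss: {loss.item():.4f}")

At query time, sign(HashHead(x)) would yield the binary code for either modality, and retrieval reduces to Hamming-distance ranking in the shared code space.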
Pages: 17
Related Papers
50 records in total
  • [31] Deep adversarial multi-label cross-modal hashing algorithm
    Yang, Xiaohan
    Wang, Zhen
    Liu, Wenhao
    Chang, Xinyi
    Wu, Nannan
    INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2023, 12 (02)
  • [32] Deep medical cross-modal attention hashing
    Zhang, Yong
    Ou, Weihua
    Shi, Yufeng
    Deng, Jiaxin
    You, Xinge
    Wang, Anzhi
    WORLD WIDE WEB, 2022, 25 : 1519 - 1536
  • [33] Deep Binary Reconstruction for Cross-modal Hashing
    Li, Xuelong
    Hu, Di
    Nie, Feiping
    PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017, : 1398 - 1406
  • [34] Cross-modal hashing with semantic deep embedding
    Yan, Cheng
    Bai, Xiao
    Wang, Shuai
    Zhou, Jun
    Hancock, Edwin R.
    NEUROCOMPUTING, 2019, 337 : 58 - 66
  • [35] Semi-Paired Asymmetric Deep Cross-Modal Hashing Learning
    Wang, Yi
    Shen, Xiaobo
    Tang, Zhenmin
    Zhang, Tao
    Lv, Jianyong
    IEEE ACCESS, 2020, 8 : 113814 - 113825
  • [36] Deep Enhanced-Similarity Attention Cross-modal Hashing Learning
    Ge, Mingyuan
    Li, Yewen
    Ma, Longfei
    Li, Mingyong
    PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023, 2023, : 612 - 616
  • [37] A novel cross-modal hashing algorithm based on multimodal deep learning
    Qu, Wen
    Wang, Daling
    Feng, Shi
    Zhang, Yifei
    Yu, Ge
    SCIENCE CHINA-INFORMATION SCIENCES, 2017, 60 (09) : 50 - 63
  • [39] Multi-Task Collaboration for Cross-Modal Generation and Multi-Modal Ophthalmic Diseases Diagnosis
    Yu, Yang
    Zhu, Hongqing
    Qian, Tianwei
    Hou, Tong
    Huang, Bingcang
    IET IMAGE PROCESSING, 2025, 19 (01)
  • [40] Cross-Modal Learning Based on Semantic Correlation and Multi-Task Learning for Text-Video Retrieval
    Wu, Xiaoyu
    Wang, Tiantian
    Wang, Shengjin
    ELECTRONICS, 2020, 9 (12) : 1 - 17