HCMSL: Hybrid Cross-modal Similarity Learning for Cross-modal Retrieval

Cited by: 36
Authors
Zhang, Chengyuan [1 ]
Song, Jiayu [2 ]
Zhu, Xiaofeng [3 ]
Zhu, Lei [4 ]
Zhang, Shichao [2 ]
Affiliations
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Hunan, Peoples R China
[2] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Hunan, Peoples R China
[3] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 610054, Sichuan, Peoples R China
[4] Hunan Agr Univ, Coll Informat & Intelligence, Changsha 410128, Hunan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cross-modal retrieval; deep learning; intra-modal semantic correlation; hybrid cross-modal similarity;
DOI
10.1145/3412847
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
The purpose of cross-modal retrieval is to find the relationship between samples of different modalities and to use a sample of one modality to retrieve samples of other modalities with similar semantics. Because data from different modalities present heterogeneous low-level features and semantically related high-level features, the central problem of cross-modal retrieval is how to measure the similarity between different modalities. In this article, we present a novel cross-modal retrieval method, named the Hybrid Cross-Modal Similarity Learning model (HCMSL for short). It aims to capture sufficient semantic information from both labeled and unlabeled cross-modal pairs and from intra-modal pairs with the same class label. Specifically, a pair of coupled deep fully connected networks is used to map cross-modal feature representations into a common subspace. A weight-sharing strategy is applied between the two branches of the networks to diminish cross-modal heterogeneity. Furthermore, two Siamese CNN models are employed to learn intra-modal similarity from samples of the same modality. Comprehensive experiments on real datasets clearly demonstrate that our proposed technique achieves substantial improvements over state-of-the-art cross-modal retrieval techniques.
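
The architecture the abstract describes (coupled fully connected branches whose final layers share weights to project both modalities into a common subspace, plus Siamese encoders for intra-modal similarity) can be sketched as follows. This is a minimal PyTorch illustration only; all module names, layer sizes, and input dimensions (e.g., 4096-d image features, 300-d text features, a 128-d common subspace) are assumptions for exposition, not the authors' implementation.

# Minimal sketch of the HCMSL architecture as described in the abstract.
# All names and dimensions are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn


class CoupledBranch(nn.Module):
    """One fully connected branch mapping a modality into the common subspace."""

    def __init__(self, in_dim: int, shared_head: nn.Module):
        super().__init__()
        # Modality-specific layers.
        self.private = nn.Sequential(nn.Linear(in_dim, 1024), nn.ReLU())
        # Weight sharing: the SAME module instance is reused by both branches,
        # which is one way to realize the abstract's weight-sharing strategy.
        self.shared = shared_head

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.shared(self.private(x))


# Shared final layers diminish cross-modal heterogeneity in the common subspace.
shared_head = nn.Sequential(nn.Linear(1024, 256), nn.ReLU(), nn.Linear(256, 128))
image_branch = CoupledBranch(in_dim=4096, shared_head=shared_head)  # e.g., CNN features
text_branch = CoupledBranch(in_dim=300, shared_head=shared_head)    # e.g., text vectors


def siamese_distance(encoder: nn.Module, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Siamese comparison: one encoder applied to two same-modality samples,
    scored by Euclidean distance between their embeddings."""
    return torch.norm(encoder(a) - encoder(b), dim=1)


if __name__ == "__main__":
    img = torch.randn(8, 4096)
    txt = torch.randn(8, 300)
    z_img, z_txt = image_branch(img), text_branch(txt)
    cross_sim = torch.cosine_similarity(z_img, z_txt)  # cross-modal similarity
    intra = siamese_distance(image_branch, img, torch.randn(8, 4096))
    print(cross_sim.shape, intra.shape)  # torch.Size([8]) torch.Size([8])
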
Pages: 22
Related Papers (50 records)
  • [31] Category Alignment Adversarial Learning for Cross-Modal Retrieval
    He, Shiyuan
    Wang, Weiyang
    Wang, Zheng
    Xu, Xing
    Yang, Yang
    Wang, Xiaoming
    Shen, Heng Tao
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (05) : 4527 - 4538
  • [32] Adversarial cross-modal retrieval based on dictionary learning
    Shang, Fei
    Zhang, Huaxiang
    Zhu, Lei
    Sun, Jiande
    NEUROCOMPUTING, 2019, 355 : 93 - 104
  • [33] Soft Contrastive Cross-Modal Retrieval
    Song, Jiayu
    Hu, Yuxuan
    Zhu, Lei
    Zhang, Chengyuan
    Zhang, Jian
    Zhang, Shichao
    APPLIED SCIENCES-BASEL, 2024, 14 (05)
  • [34] Probabilistic Embeddings for Cross-Modal Retrieval
    Chun, Sanghyuk
    Oh, Seong Joon
    de Rezende, Rafael Sampaio
    Kalantidis, Yannis
    Larlus, Diane
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 8411 - 8420
  • [35] Cross-modal Retrieval with Correspondence Autoencoder
    Feng, Fangxiang
    Wang, Xiaojie
    Li, Ruifan
    PROCEEDINGS OF THE 2014 ACM CONFERENCE ON MULTIMEDIA (MM'14), 2014, : 7 - 16
  • [37] The cross-modal double flash illusion depends on featural similarity between cross-modal inducers
    Roseboom, Warrick
    Kawabe, Takahiro
    Nishida, Shin'ya
    SCIENTIFIC REPORTS, 2013, 3
  • [38] Deep Multimodal Transfer Learning for Cross-Modal Retrieval
    Zhen, Liangli
    Hu, Peng
    Peng, Xi
    Goh, Rick Siow Mong
    Zhou, Joey Tianyi
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (02) : 798 - 810
  • [39] Heterogeneous Metric Learning for Cross-Modal Multimedia Retrieval
    Deng, Jun
    Du, Liang
    Shen, Yi-Dong
    WEB INFORMATION SYSTEMS ENGINEERING - WISE 2013, PT I, 2013, 8180 : 43 - 56
  • [40] Cross-Modal Retrieval for CPSS Data
    Zhong, Fangming
    Wang, Guangze
    Chen, Zhikui
    Xia, Feng
    Min, Geyong
    IEEE ACCESS, 2020, 8 : 16689 - 16701