HCMSL: Hybrid Cross-modal Similarity Learning for Cross-modal Retrieval

Cited by: 36
Authors
Zhang, Chengyuan [1 ]
Song, Jiayu [2 ]
Zhu, Xiaofeng [3 ]
Zhu, Lei [4 ]
Zhang, Shichao [2 ]
Affiliations
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Hunan, Peoples R China
[2] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Hunan, Peoples R China
[3] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 610054, Sichuan, Peoples R China
[4] Hunan Agr Univ, Coll Informat & Intelligence, Changsha 410128, Hunan, Peoples R China
Funding
National Natural Science Foundation of China;
关键词
Cross-modal retrieval; deep learning; intra-modal semantic correlation; hybrid cross-modal similarity;
DOI
10.1145/3412847
CLC number
TP [Automation and Computer Technology];
Discipline code
0812;
Abstract
The purpose of cross-modal retrieval is to find the relationship between samples of different modalities and to use a sample of one modality to retrieve samples of other modalities with similar semantics. Because data of different modalities present heterogeneous low-level features and semantically related high-level features, the central problem of cross-modal retrieval is how to measure the similarity between different modalities. In this article, we present a novel cross-modal retrieval method, named the Hybrid Cross-Modal Similarity Learning model (HCMSL for short). It aims to capture sufficient semantic information from both labeled and unlabeled cross-modal pairs, as well as from intra-modal pairs with the same classification label. Specifically, coupled deep fully connected networks are used to map cross-modal feature representations into a common subspace. A weight-sharing strategy is utilized between the two branches of the networks to diminish cross-modal heterogeneity. Furthermore, two Siamese CNN models are employed to learn intra-modal similarity from samples of the same modality. Comprehensive experiments on real datasets clearly demonstrate that our proposed technique achieves substantial improvements over state-of-the-art cross-modal retrieval techniques.
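The abstract's core idea — projecting both modalities into a common subspace through branch networks that share their final layer, then measuring similarity there — can be sketched as follows. This is a minimal NumPy illustration with made-up dimensions and untrained random weights, not the authors' actual HCMSL network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (hypothetical) dimensions, not taken from the paper.
IMG_DIM, TXT_DIM, HIDDEN, COMMON = 512, 300, 128, 64

# Modality-specific first layers map heterogeneous features to one size.
W_img = rng.standard_normal((IMG_DIM, HIDDEN)) * 0.01
W_txt = rng.standard_normal((TXT_DIM, HIDDEN)) * 0.01

# Weight-sharing: both branches reuse the SAME final projection — one way
# to diminish cross-modal heterogeneity, as the abstract describes.
W_shared = rng.standard_normal((HIDDEN, COMMON)) * 0.01

def embed(x, W_branch):
    """Project a feature vector into the common subspace."""
    h = np.maximum(x @ W_branch, 0.0)       # ReLU branch layer
    z = h @ W_shared                        # shared layer -> common subspace
    return z / (np.linalg.norm(z) + 1e-12)  # L2-normalize for cosine similarity

def cross_modal_similarity(img_feat, txt_feat):
    """Cosine similarity between an image and a text in the common subspace."""
    return float(embed(img_feat, W_img) @ embed(txt_feat, W_txt))

sim = cross_modal_similarity(rng.standard_normal(IMG_DIM),
                             rng.standard_normal(TXT_DIM))
assert -1.0 <= sim <= 1.0
```

In the actual model the weights would be trained on labeled and unlabeled pairs so that semantically matching image–text pairs score high, and retrieval amounts to ranking candidates of the other modality by this similarity.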
Pages: 22
Related papers
50 records
  • [1] Hybrid representation learning for cross-modal retrieval
    Cao, Wenming
    Lin, Qiubin
    He, Zhihai
    He, Zhiquan
    NEUROCOMPUTING, 2019, 345 : 45 - 57
  • [2] Hashing for Cross-Modal Similarity Retrieval
    Liu, Yao
    Yuan, Yanhong
    Huang, Qiaoli
    Huang, Zhixing
    2015 11TH INTERNATIONAL CONFERENCE ON SEMANTICS, KNOWLEDGE AND GRIDS (SKG), 2015, : 1 - 8
  • [3] Deep Hashing Similarity Learning for Cross-Modal Retrieval
    Ma, Ying
    Wang, Meng
    Lu, Guangyun
    Sun, Yajun
    IEEE ACCESS, 2024, 12 : 8609 - 8618
  • [4] Online Asymmetric Similarity Learning for Cross-Modal Retrieval
    Wu, Yiling
    Wang, Shuhui
    Huang, Qingming
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 3984 - 3993
  • [5] Continual learning in cross-modal retrieval
    Wang, Kai
    Herranz, Luis
    van de Weijer, Joost
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 3623 - 3633
  • [6] Similarity drifting problem in cross-modal retrieval
    Zheng Q.
    Diao X.
    Wang Y.
    Cao J.
    Liu Y.
    Qin W.
    National University of Defense Technology (43) : 99 - 106
  • [7] Learning DALTS for cross-modal retrieval
    Yu, Zheng
    Wang, Wenmin
    CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2019, 4 (01) : 9 - 16
  • [8] Sequential Learning for Cross-modal Retrieval
    Song, Ge
    Tan, Xiaoyang
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 4531 - 4539
  • [9] DRSL: Deep Relational Similarity Learning for Cross-modal Retrieval
    Wang, Xu
    Hu, Peng
    Zhen, Liangli
    Peng, Dezhong
    INFORMATION SCIENCES, 2021, 546 : 298 - 311
  • [10] Adversarial Cross-Modal Retrieval
    Wang, Bokun
    Yang, Yang
    Xu, Xing
    Hanjalic, Alan
    Shen, Heng Tao
    PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017, : 154 - 162