HCMSL: Hybrid Cross-modal Similarity Learning for Cross-modal Retrieval

Cited by: 36
Authors
Zhang, Chengyuan [1]
Song, Jiayu [2]
Zhu, Xiaofeng [3]
Zhu, Lei [4]
Zhang, Shichao [2]
Affiliations
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Hunan, Peoples R China
[2] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Hunan, Peoples R China
[3] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 610054, Sichuan, Peoples R China
[4] Hunan Agr Univ, Coll Informat & Intelligence, Changsha 410128, Hunan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cross-modal retrieval; deep learning; intra-modal semantic correlation; hybrid cross-modal similarity;
DOI
10.1145/3412847
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Cross-modal retrieval aims to model the relationship between samples of different modalities, so that a sample from one modality can be used to retrieve semantically similar samples from another. Because data of different modalities present heterogeneous low-level features and semantically related high-level features, the main problem of cross-modal retrieval is how to measure the similarity between different modalities. In this article, we present a novel cross-modal retrieval method, named the Hybrid Cross-Modal Similarity Learning model (HCMSL for short). It aims to capture sufficient semantic information from both labeled and unlabeled cross-modal pairs and from intra-modal pairs with the same classification label. Specifically, coupled deep fully connected networks are used to map cross-modal feature representations into a common subspace. A weight-sharing strategy is used between the two network branches to diminish cross-modal heterogeneity. Furthermore, two Siamese CNN models are employed to learn intra-modal similarity from samples of the same modality. Comprehensive experiments on real datasets demonstrate that the proposed technique achieves substantial improvements over state-of-the-art cross-modal retrieval techniques.
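The common-subspace idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which layers share weights, how the networks are trained, or what feature sizes are used, so the class name `CoupledProjector`, the choice to share only the final layer, the random untrained weights, and all dimensions below are assumptions made purely for illustration. The Siamese CNN branches for intra-modal similarity are not covered.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(x):
    return np.maximum(x, 0.0)


class CoupledProjector:
    """Two fully connected branches that end in one shared layer.

    Hypothetical illustration: each modality has its own first layer,
    and both branches reuse the same final weight matrix so that their
    outputs live in one common subspace.
    """

    def __init__(self, img_dim, txt_dim, hidden_dim, common_dim):
        # Modality-specific layers (assumed, untrained random weights).
        self.W_img = rng.normal(scale=0.1, size=(img_dim, hidden_dim))
        self.W_txt = rng.normal(scale=0.1, size=(txt_dim, hidden_dim))
        # Shared layer: the same matrix is applied in both branches.
        self.W_shared = rng.normal(scale=0.1, size=(hidden_dim, common_dim))

    @staticmethod
    def _normalize(z):
        # Unit-length rows, so a dot product equals cosine similarity.
        return z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-12)

    def embed_image(self, x):
        return self._normalize(relu(x @ self.W_img) @ self.W_shared)

    def embed_text(self, x):
        return self._normalize(relu(x @ self.W_txt) @ self.W_shared)


def rank_by_cosine(query, gallery):
    """Return gallery indices sorted from most to least similar."""
    scores = gallery @ query
    return np.argsort(-scores)


if __name__ == "__main__":
    model = CoupledProjector(img_dim=8, txt_dim=5, hidden_dim=6, common_dim=4)
    images = rng.normal(size=(3, 8))  # stand-in image features
    texts = rng.normal(size=(2, 5))   # stand-in text features
    order = rank_by_cosine(model.embed_text(texts)[0], model.embed_image(images))
    print("images ranked for text query 0:", order)
```

With untrained weights the ranking is meaningless; the sketch only shows how features of different sizes end up in one space where a text query can be scored against images.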
Pages: 22