HCMSL: Hybrid Cross-modal Similarity Learning for Cross-modal Retrieval

Cited by: 36
Authors
Zhang, Chengyuan [1 ]
Song, Jiayu [2 ]
Zhu, Xiaofeng [3 ]
Zhu, Lei [4 ]
Zhang, Shichao [2 ]
Affiliations
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Hunan, Peoples R China
[2] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Hunan, Peoples R China
[3] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 610054, Sichuan, Peoples R China
[4] Hunan Agr Univ, Coll Informat & Intelligence, Changsha 410128, Hunan, Peoples R China
Funding
National Natural Science Foundation of China;
关键词
Cross-modal retrieval; deep learning; intra-modal semantic correlation; hybrid cross-modal similarity;
DOI
10.1145/3412847
CLC number
TP [Automation and Computer Technology];
Discipline code
0812;
Abstract
The purpose of cross-modal retrieval is to find the relationship between samples of different modalities and to use a sample of one modality to retrieve samples of other modalities with similar semantics. Because data of different modalities present heterogeneous low-level features and semantically related high-level features, the central problem of cross-modal retrieval is how to measure the similarity between different modalities. In this article, we present a novel cross-modal retrieval method, named the Hybrid Cross-Modal Similarity Learning model (HCMSL for short). It aims to capture sufficient semantic information from both labeled and unlabeled cross-modal pairs, as well as from intra-modal pairs with the same classification label. Specifically, coupled deep fully connected networks are used to map cross-modal feature representations into a common subspace. A weight-sharing strategy is utilized between the two branches of the networks to diminish cross-modal heterogeneity. Furthermore, two Siamese CNN models are employed to learn intra-modal similarity from samples of the same modality. Comprehensive experiments on real datasets clearly demonstrate that our proposed technique achieves substantial improvements over state-of-the-art cross-modal retrieval techniques.
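The abstract's core idea — projecting both modalities into a common subspace through branch networks that share their final layer, then measuring similarity there — can be sketched as follows. This is a minimal NumPy illustration with made-up dimensions and untrained random weights, not the authors' actual HCMSL network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (hypothetical) dimensions, not taken from the paper.
IMG_DIM, TXT_DIM, HIDDEN, COMMON = 512, 300, 128, 64

# Modality-specific first layers map heterogeneous features to one size.
W_img = rng.standard_normal((IMG_DIM, HIDDEN)) * 0.01
W_txt = rng.standard_normal((TXT_DIM, HIDDEN)) * 0.01

# Weight-sharing: both branches reuse the SAME final projection — one way
# to diminish cross-modal heterogeneity, as the abstract describes.
W_shared = rng.standard_normal((HIDDEN, COMMON)) * 0.01

def embed(x, W_branch):
    """Project a feature vector into the common subspace."""
    h = np.maximum(x @ W_branch, 0.0)       # ReLU branch layer
    z = h @ W_shared                        # shared layer -> common subspace
    return z / (np.linalg.norm(z) + 1e-12)  # L2-normalize for cosine similarity

def cross_modal_similarity(img_feat, txt_feat):
    """Cosine similarity between an image and a text in the common subspace."""
    return float(embed(img_feat, W_img) @ embed(txt_feat, W_txt))

sim = cross_modal_similarity(rng.standard_normal(IMG_DIM),
                             rng.standard_normal(TXT_DIM))
assert -1.0 <= sim <= 1.0
```

In the actual model the weights would be trained on labeled and unlabeled pairs so that semantically matching image–text pairs score high, and retrieval amounts to ranking candidates of the other modality by this similarity.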
Pages: 22
Related papers
50 records
  • [1] Hybrid representation learning for cross-modal retrieval
    Cao, Wenming
    Lin, Qiubin
    He, Zhihai
    He, Zhiquan
    NEUROCOMPUTING, 2019, 345 : 45 - 57
  • [2] Hashing for Cross-Modal Similarity Retrieval
    Liu, Yao
    Yuan, Yanhong
    Huang, Qiaoli
    Huang, Zhixing
    2015 11TH INTERNATIONAL CONFERENCE ON SEMANTICS, KNOWLEDGE AND GRIDS (SKG), 2015, : 1 - 8
  • [3] Deep Hashing Similarity Learning for Cross-Modal Retrieval
    Ma, Ying
    Wang, Meng
    Lu, Guangyun
    Sun, Yajun
    IEEE ACCESS, 2024, 12 : 8609 - 8618
  • [4] Online Asymmetric Similarity Learning for Cross-Modal Retrieval
    Wu, Yiling
    Wang, Shuhui
    Huang, Qingming
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 3984 - 3993
  • [5] Continual learning in cross-modal retrieval
    Wang, Kai
    Herranz, Luis
    van de Weijer, Joost
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 3623 - 3633
  • [6] Similarity drifting problem in cross-modal retrieval
    Zheng Q.
    Diao X.
    Wang Y.
    Cao J.
    Liu Y.
    Qin W.
    National University of Defense Technology (43) : 99 - 106
  • [7] Learning DALTS for cross-modal retrieval
    Yu, Zheng
    Wang, Wenmin
    CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2019, 4 (01) : 9 - 16
  • [8] Sequential Learning for Cross-modal Retrieval
    Song, Ge
    Tan, Xiaoyang
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 4531 - 4539
  • [9] DRSL: Deep Relational Similarity Learning for Cross-modal Retrieval
    Wang, Xu
    Hu, Peng
    Zhen, Liangli
    Peng, Dezhong
    INFORMATION SCIENCES, 2021, 546 : 298 - 311
  • [10] Adversarial Cross-Modal Retrieval
    Wang, Bokun
    Yang, Yang
    Xu, Xing
    Hanjalic, Alan
    Shen, Heng Tao
    PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017, : 154 - 162