Cross-Modal Correlation Learning with Deep Convolutional Architecture

Cited: 0
Authors
Hua, Yan [1 ]
Tian, Hu [2 ]
Cai, Anni [3 ]
Shi, Ping [1 ]
Affiliations
[1] Commun Univ China, Beijing, Peoples R China
[2] Fujitsu Res & Dev Ctr, Beijing, Peoples R China
[3] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
Keywords
Deep architecture; Convolution; Correlation learning; Large margin; Cross-modal retrieval;
DOI: not available
Chinese Library Classification: TP [Automation & Computer Technology]
Discipline Code: 0812
Abstract
With the explosive growth of online multimedia data, methodologies for retrieving documents across heterogeneous modalities are indispensable for facilitating information acquisition in real applications. Most existing research efforts focus on building correlation learning models on hand-crafted features for the visual and textual modalities. However, they lack the ability to capture meaningful patterns from the complicated visual modality, and they cannot identify the true correlation between modalities during the feature learning process. In this paper, we propose a novel cross-modal correlation learning method with a well-designed deep convolutional network to learn representations for the visual modality. A cross-modal correlation layer with a linear projection is added on top of the network, maximizing semantic consistency under a large-margin principle. All parameters are jointly optimized with stochastic gradient descent. With the deep architecture, our model is able to disentangle complex visual information and learn semantically consistent patterns in a layer-by-layer fashion. Experimental results on the widely used NUS-WIDE dataset show that our model outperforms state-of-the-art correlation learning methods built on six hand-crafted visual features for image-text retrieval.
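The correlation layer and large-margin objective described in the abstract can be sketched as follows. This is an illustrative NumPy reconstruction, not the authors' implementation: the feature dimensions, the cosine scoring, and the bidirectional hinge form of the ranking loss are all assumptions, and the CNN visual features are replaced by random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: CNN visual features, text features, shared space.
d_img, d_txt, d_common, n = 64, 32, 16, 8

# Linear projections into a common correlation space (a stand-in for the
# paper's cross-modal correlation layer; weights here are random, not learned).
W_img = rng.normal(scale=0.1, size=(d_img, d_common))
W_txt = rng.normal(scale=0.1, size=(d_txt, d_common))

img = rng.normal(size=(n, d_img))   # stand-in for deep CNN activations
txt = rng.normal(size=(n, d_txt))   # stand-in for textual features

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def margin_loss(img, txt, W_img, W_txt, margin=0.2):
    """Large-margin ranking loss: a matched image-text pair should score
    higher than any mismatched pair by at least `margin`."""
    u = l2_normalize(img @ W_img)          # projected visual embeddings
    v = l2_normalize(txt @ W_txt)          # projected text embeddings
    scores = u @ v.T                       # n x n cosine similarities
    pos = np.diag(scores)                  # matched-pair scores
    # Hinge penalties for ranking violations in both retrieval directions.
    cost_i2t = np.maximum(0.0, margin - pos[:, None] + scores)
    cost_t2i = np.maximum(0.0, margin - pos[None, :] + scores)
    np.fill_diagonal(cost_i2t, 0.0)        # matched pairs incur no penalty
    np.fill_diagonal(cost_t2i, 0.0)
    return (cost_i2t.sum() + cost_t2i.sum()) / n

loss = margin_loss(img, txt, W_img, W_txt)
print(loss >= 0.0)  # hinge terms are non-negative, so this prints True
```

In the paper this objective is minimized jointly with the convolutional layers by stochastic gradient descent; here the projections are fixed only to keep the sketch short.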
Pages: 4
Related Papers
50 in total
  • [21] Multimedia Feature Mapping and Correlation Learning for Cross-Modal Retrieval
    Yuan, Xu
    Zhong, Hua
    Chen, Zhikui
    Zhong, Fangming
    Hu, Yueming
    INTERNATIONAL JOURNAL OF GRID AND HIGH PERFORMANCE COMPUTING, 2018, 10 (03) : 29 - 45
  • [22] Show and Tell in the Loop: Cross-Modal Circular Correlation Learning
    Peng, Yuxin
    Qi, Jinwei
    IEEE TRANSACTIONS ON MULTIMEDIA, 2019, 21 (06) : 1538 - 1550
  • [23] Cross-Modal Correlation Learning by Adaptive Hierarchical Semantic Aggregation
    Hua, Yan
    Wang, Shuhui
    Liu, Siyuan
    Cai, Anni
    Huang, Qingming
    IEEE TRANSACTIONS ON MULTIMEDIA, 2016, 18 (06) : 1201 - 1216
  • [24] HCMSL: Hybrid Cross-modal Similarity Learning for Cross-modal Retrieval
    Zhang, Chengyuan
    Song, Jiayu
    Zhu, Xiaofeng
    Zhu, Lei
    Zhang, Shichao
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2021, 17 (01)
  • [25] CLASSIFICATION OF BREAST LESIONS USING CROSS-MODAL DEEP LEARNING
    Hadad, Omer
    Bakalo, Ran
    Ben-Ari, Rami
    Hashoul, Sharbell
    Amit, Guy
    2017 IEEE 14TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI 2017), 2017, : 109 - 112
  • [26] Learning Cross-Modal Deep Representations for Robust Pedestrian Detection
    Xu, Dan
    Ouyang, Wanli
    Ricci, Elisa
    Wang, Xiaogang
    Sebe, Nicu
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 4236 - 4244
  • [27] Deep Evidential Learning with Noisy Correspondence for Cross-modal Retrieval
    Qin, Yang
    Peng, Dezhong
    Peng, Xi
    Wang, Xu
    Hu, Peng
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 4948 - 4956
  • [28] Cross-Modal Retrieval using Random Multimodal Deep Learning
    Somasekar, Hemanth
    Naveen, Kavya
    JOURNAL OF MECHANICS OF CONTINUA AND MATHEMATICAL SCIENCES, 2019, 14 (02): : 185 - 200
  • [29] DRSL: Deep Relational Similarity Learning for Cross-modal Retrieval
    Wang, Xu
    Hu, Peng
    Zhen, Liangli
    Peng, Dezhong
    INFORMATION SCIENCES, 2021, 546 : 298 - 311
  • [30] Deep Cross-Modal Hashing With Ranking Learning for Noisy Labels
    Shu, Zhenqiu
    Bai, Yibing
    Yong, Kailing
    Yu, Zhengtao
    IEEE TRANSACTIONS ON BIG DATA, 2025, 11 (02) : 553 - 565