Cross-Modal Correlation Learning with Deep Convolutional Architecture

Authors
Hua, Yan [1]
Tian, Hu [2]
Cai, Anni [3]
Shi, Ping [1]
Affiliations
[1] Commun Univ China, Beijing, Peoples R China
[2] Fujitsu Res & Dev Ctr, Beijing, Peoples R China
[3] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
Keywords
Deep architecture; Convolution; Correlation learning; Large margin; Cross-modal retrieval
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the explosive growth of online multimedia data, methods for retrieving documents across heterogeneous modalities are indispensable for information acquisition in real applications. Most existing research focuses on building correlation learning models on hand-crafted features for the visual and textual modalities. However, such models cannot capture meaningful patterns in the complicated visual modality, and they cannot identify the true correlation between modalities during feature learning. In this paper, we propose a novel cross-modal correlation learning method that uses a well-designed deep convolutional network to learn representations from the visual modality. A cross-modal correlation layer with a linear projection is added on top of the network and trained to maximize semantic consistency under the large-margin principle. All parameters are jointly optimized with stochastic gradient descent. With the deep architecture, our model is able to disentangle the complex visual information and learn semantically consistent patterns in a layer-by-layer fashion. Experimental results on the widely used NUS-WIDE dataset show that our model outperforms state-of-the-art correlation learning methods built on six hand-crafted visual features for image-text retrieval.
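As a rough illustration of the correlation layer described in the abstract, the following NumPy sketch scores image-text pairs through a linear projection and trains it with a large-margin (hinge) ranking loss. It is not the authors' implementation: the bilinear score `img @ W @ txt.T`, the exact hinge form, the margin value, and the names `large_margin_loss` and `sgd_step` are all assumptions made for illustration, and the image features are treated as fixed inputs rather than the output of a jointly trained convolutional network.

```python
import numpy as np

def large_margin_loss(img_feat, txt_feat, W, margin=1.0):
    """Hinge ranking loss for a batch where image i matches text i.

    img_feat: (n, d_img), txt_feat: (n, d_txt), W: (d_img, d_txt).
    The bilinear score and the margin are illustrative assumptions.
    """
    scores = img_feat @ W @ txt_feat.T        # scores[i, j]: image i vs text j
    pos = np.diag(scores)[:, None]            # scores of the matched pairs
    viol = np.maximum(0.0, margin - pos + scores)
    np.fill_diagonal(viol, 0.0)               # a pair does not compete with itself
    return viol.sum() / len(img_feat), viol

def sgd_step(img_feat, txt_feat, W, lr=0.5, margin=1.0):
    """One gradient step on W for the loss above (hypothetical helper)."""
    _, viol = large_margin_loss(img_feat, txt_feat, W, margin)
    active = (viol > 0).astype(float)         # margin-violating (i, j) pairs
    # d(loss)/dW = sum over active pairs of img_i (txt_j - txt_i)^T, divided by n
    diff = active @ txt_feat - active.sum(axis=1)[:, None] * txt_feat
    grad = img_feat.T @ diff / len(img_feat)
    return W - lr * grad

# Toy batch: two matched pairs with one-hot features.
img = np.eye(2)
txt = np.eye(2)
W0 = np.zeros((2, 2))
loss_before, _ = large_margin_loss(img, txt, W0)
W1 = sgd_step(img, txt, W0)
loss_after, _ = large_margin_loss(img, txt, W1)
```

In the method summarized above, the gradient would also flow back into the convolutional layers so that all parameters are optimized jointly; the sketch covers only the top projection layer.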
Pages: 4