Deep multi-view representation learning for social images

Cited by: 11
Authors
Huang, Feiran [1]
Zhang, Xiaoming [2]
Zhao, Zhonghua [3]
Li, Zhoujun [1]
He, Yueying [3]
Affiliations
[1] Beihang Univ, State Key Lab Software Dev Environm, Beijing 100191, Peoples R China
[2] Beihang Univ, Sch Cyber Sci & Technol, Beijing 100191, Peoples R China
[3] Coordinat Ctr China, Natl Comp Network Emergency Response Tech Team, Beijing 100029, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multi-view learning; Image embedding; Representation learning; Stacked autoencoder
DOI
10.1016/j.asoc.2018.08.010
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-view representation learning for social images has recently made remarkable achievements in many tasks, such as cross-view classification and cross-modal retrieval. Since social images usually carry link information in addition to their multi-modal content (e.g., text descriptions and visual content), relying on the content alone may yield sub-optimal multi-view representations. In this paper, we propose a Deep Multi-View Embedding Model (DMVEM) to learn joint embeddings for three views: the visual content, the associated text descriptions, and the relations between images. To encode the link information effectively, a weighted relation network is built from the linkages between social images and then embedded into a low-dimensional vector space using the Skip-Gram model. The learned vector is treated as a third view alongside the visual content and the text description. To learn a joint representation from the three views, a deep learning model with a three-branch nonlinear neural network is proposed. A three-view bi-directional loss function captures the correlation between the three views, and a stacked autoencoder is adopted to preserve the self-structure and reconstructability of the learned representation for each view. Comprehensive experiments are conducted on image-to-text, text-to-image, and image-to-image search tasks. Compared with state-of-the-art multi-view embedding methods, our approach achieves a significant performance improvement. (C) 2018 Elsevier B.V. All rights reserved.
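The abstract names a three-view bi-directional loss but does not give its formula. The sketch below is one plausible reading, not the paper's definition: it assumes a hinge-based ranking loss over cosine similarities, applied in both directions to each pair of views (image-text, image-relation, text-relation). The function names, the margin value, and the choice of cosine similarity are all illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def bidirectional_hinge(view_a, view_b, margin=0.1):
    """Hinge ranking loss between two views, summed over both directions.

    Row i of view_a and row i of view_b are assumed to describe the same
    social image, so (i, i) is the positive pair and (i, j != i) are negatives.
    """
    n = len(view_a)
    sim = [[cosine(view_a[i], view_b[j]) for j in range(n)] for i in range(n)]
    loss = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # a_i as query: its true match b_i should outrank b_j by the margin.
            loss += max(0.0, margin + sim[i][j] - sim[i][i])
            # b_i as query: its true match a_i should outrank a_j by the margin.
            loss += max(0.0, margin + sim[j][i] - sim[i][i])
    return loss


def three_view_loss(img, txt, rel, margin=0.1):
    """Sum the bi-directional loss over the three view pairs (an assumption)."""
    pairs = [(img, txt), (img, rel), (txt, rel)]
    return sum(bidirectional_hinge(a, b, margin) for a, b in pairs)
```

Under these assumptions the loss is zero when every item's three embeddings agree and are well separated from other items, and it grows when a mismatched item scores within the margin of the true match. In a real model the three inputs would be the outputs of the three network branches, and the loss would be minimized with an automatic-differentiation framework rather than evaluated in plain Python.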
Pages: 106-118
Page count: 13