Deep multi-view representation learning for social images

Cited by: 11
Authors
Huang, Feiran [1]
Zhang, Xiaoming [2]
Zhao, Zhonghua [3]
Li, Zhoujun [1]
He, Yueying [3]
Affiliations
[1] Beihang Univ, State Key Lab Software Dev Environm, Beijing 100191, Peoples R China
[2] Beihang Univ, Sch Cyber Sci & Technol, Beijing 100191, Peoples R China
[3] Coordinat Ctr China, Natl Comp Network Emergency Response Tech Team, Beijing 100029, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multi-view learning; Image embedding; Representation learning; Stacked autoencoder
DOI
10.1016/j.asoc.2018.08.010
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-view representation learning for social images has recently made remarkable achievements in many tasks, such as cross-view classification and cross-modal retrieval. Since social images usually contain link information in addition to their multi-modal content (e.g., text descriptions and visual content), relying on the data content alone may result in a sub-optimal multi-view representation of the social images. In this paper, we propose a Deep Multi-View Embedding Model (DMVEM) to learn joint embeddings for three views: the visual content, the associated text descriptions, and the relations between images. To effectively encode the link information, a weighted relation network is built from the linkages between social images and then embedded into a low-dimensional vector space using the Skip-Gram model. The learned vector is regarded as the third view, alongside the visual content and the text description. To learn a joint representation from the three views, a deep learning model with a three-branch nonlinear neural network is proposed. A three-view bi-directional loss function is used to capture the correlation between the three views. A stacked autoencoder is adopted to preserve the self-structure and reconstructability of the learned representation for each view. Comprehensive experiments are conducted on image-to-text, text-to-image, and image-to-image search tasks. Compared to state-of-the-art multi-view embedding methods, our approach achieves significant performance improvements. © 2018 Elsevier B.V. All rights reserved.
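The abstract names a three-view bi-directional loss but does not give its formula. As a rough illustration only, the sketch below assumes a common form of such a loss: a hinge-based ranking loss on cosine similarity, applied in both retrieval directions and summed over the three view pairs. The function names, the margin value, and the choice of hinge loss are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def bidirectional_hinge(a, b, margin=0.1):
    """Hypothetical bi-directional ranking loss between two views.

    Row i of `a` and row i of `b` are embeddings of the same item.
    The margin of 0.1 is an assumed value, not one from the paper.
    """
    sim = l2_normalize(a) @ l2_normalize(b).T      # (n, n) cosine similarities
    pos = np.diag(sim)                             # similarity of matched pairs
    # Direction a -> b: a mismatched b_j should score below the true b_i.
    cost_ab = np.maximum(0.0, margin + sim - pos[:, None])
    # Direction b -> a: a mismatched a_i should score below the true a_j.
    cost_ba = np.maximum(0.0, margin + sim - pos[None, :])
    off_diag = 1.0 - np.eye(sim.shape[0])          # ignore the matched pairs
    return float(((cost_ab + cost_ba) * off_diag).sum() / sim.shape[0])


def three_view_loss(image, text, link, margin=0.1):
    """Sum the pairwise loss over image-text, image-link, and text-link."""
    return (bidirectional_hinge(image, text, margin)
            + bidirectional_hinge(image, link, margin)
            + bidirectional_hinge(text, link, margin))


if __name__ == "__main__":
    aligned = np.eye(3)                 # three items, perfectly separated
    shuffled = aligned[[1, 2, 0]]       # same vectors, wrong pairing
    print(three_view_loss(aligned, aligned, aligned))    # aligned views
    print(three_view_loss(aligned, shuffled, aligned))   # misaligned text view
```

Under these assumptions the loss is zero when matched items are more similar to each other than to any mismatched item by at least the margin, and it grows when one view is paired with the wrong items. The reconstruction term from the stacked autoencoder mentioned in the abstract is not modeled here.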
Pages: 106-118
Number of pages: 13