Deep multi-view representation learning for social images

Cited by: 11
Authors
Huang, Feiran [1]
Zhang, Xiaoming [2]
Zhao, Zhonghua [3]
Li, Zhoujun [1]
He, Yueying [3]
Affiliations
[1] Beihang Univ, State Key Lab Software Dev Environm, Beijing 100191, Peoples R China
[2] Beihang Univ, Sch Cyber Sci & Technol, Beijing 100191, Peoples R China
[3] Coordinat Ctr China, Natl Comp Network Emergency Response Tech Team, Beijing 100029, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multi-view learning; Image embedding; Representation learning; Stacked autoencoder
DOI
10.1016/j.asoc.2018.08.010
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-view representation learning for social images has recently achieved remarkable results in many tasks, such as cross-view classification and cross-modal retrieval. Since social images usually carry link information in addition to their multi-modal content (e.g., text descriptions and visual content), relying on the content alone may yield sub-optimal multi-view representations. In this paper, we propose a Deep Multi-View Embedding Model (DMVEM) to learn joint embeddings for three views: the visual content, the associated text descriptions, and the relations between images. To encode the link information effectively, a weighted relation network is built from the linkages between social images and then embedded into a low-dimensional vector space using the Skip-Gram model. The learned vector is treated as the third view alongside the visual content and the text description. To learn a joint representation from the three views, a deep learning model with a three-branch nonlinear neural network is proposed. A three-view bi-directional loss function captures the correlation between the three views, and a stacked autoencoder is adopted to preserve the self-structure and reconstructability of the learned representation for each view. Comprehensive experiments are conducted on image-to-text, text-to-image, and image-to-image search tasks. Compared with state-of-the-art multi-view embedding methods, our approach achieves significantly better performance. © 2018 Elsevier B.V. All rights reserved.
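The abstract names a three-view bi-directional loss but does not give its form. As a rough illustration only, the sketch below assumes a margin-based hinge ranking loss applied in both retrieval directions for each pair of views, summed over the three pairs. The function names (`cosine`, `bidirectional_loss`, `three_view_loss`), the cosine similarity, and the margin value are all assumptions made for this example and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def bidirectional_loss(xs, ys, margin=0.1):
    """Hinge ranking loss between two views, in both directions.

    xs[i] and ys[i] are embeddings of the same image in two views.
    Assumed formulation: a matching pair should score higher than any
    non-matching pair by at least `margin`.
    """
    loss = 0.0
    n = len(xs)
    for i in range(n):
        pos = cosine(xs[i], ys[i])
        for j in range(n):
            if j == i:
                continue
            # x_i as the query: y_i should outrank the non-matching y_j.
            loss += max(0.0, margin - pos + cosine(xs[i], ys[j]))
            # y_i as the query: x_i should outrank the non-matching x_j.
            loss += max(0.0, margin - pos + cosine(ys[i], xs[j]))
    return loss


def three_view_loss(visual, text, relation, margin=0.1):
    """Sum the bi-directional loss over the three pairs of views."""
    return (
        bidirectional_loss(visual, text, margin)
        + bidirectional_loss(visual, relation, margin)
        + bidirectional_loss(text, relation, margin)
    )


if __name__ == "__main__":
    # Toy embeddings for two images in each of the three views.
    visual = [[1.0, 0.0], [0.0, 1.0]]
    text = [[1.0, 0.0], [0.0, 1.0]]
    relation = [[1.0, 0.0], [0.0, 1.0]]
    print(three_view_loss(visual, text, relation))
```

With these toy embeddings the three views agree perfectly, so no margin is violated and the loss is zero. Swapping the two text embeddings makes the text view disagree with the other two and the loss becomes positive.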
Pages: 106-118
Number of pages: 13