Classify social image by integrating multi-modal content

Cited by: 0
Authors
Xiaoming Zhang
Xu Zhang
Xiong Li
Zhoujun Li
Senzhang Wang
Affiliations
[1] Beihang University, Beijing Key Laboratory of Network Technology
[2] National Computer Network Emergency Response Technical Team of China, State Key Laboratory of Software Development Environment
[3] Beihang University, College of Computer Science and Technology
[4] Nanjing University of Aeronautics and Astronautics
Source
Keywords
Image classification; Multi-modal classification; Social image analysis
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
The volume of social images is growing with the development of social networks and digital cameras. These images are usually annotated with textual tags in addition to their visual content. Organizing and managing this large number of social images automatically is an urgent need. Image classification is the basic task underlying these applications and has attracted great research effort. Although there is much research on image classification, integrating the multi-modal content of social images simultaneously for classification remains a considerable challenge, since the textual content and visual content are represented in two heterogeneous feature spaces. In this paper, we propose a multi-modal learning method that integrates multi-modal features seamlessly through their correlation. Specifically, we learn two linear classification modules for the two types of features, which are then integrated by the ℓ2 normalization method via a joint model. Each classifier is regularized with the ℓ2,1 norm to reduce the effect of noisy features by selecting a subset of the more important features. With the joint model, the classification based on visual features can be reinforced by the classification based on textual features, and vice versa. A test image is then classified based on both its textual and visual features by combining the results of the two classifiers. Experiments conducted on real-world social image datasets demonstrate the superiority of the proposed method over representative baselines.
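The abstract does not state the objective function, so the following is only a minimal sketch of the ingredients it names: an ℓ2,1 norm on each linear classifier's weight matrix, a joint loss that couples the visual and textual classifiers, and a prediction that combines both score matrices. The function names, the squared-error loss, the agreement term, and the weights `alpha` and `beta` are all assumptions made for illustration, not the paper's formulation.

```python
import numpy as np

def l21_norm(W):
    """l2,1 norm: sum of the Euclidean norms of the rows of W.

    Penalizing it pushes whole rows (features) toward zero,
    which acts as feature selection.
    """
    return float(np.sum(np.linalg.norm(W, axis=1)))

def joint_objective(Xv, Xt, Y, Wv, Wt, alpha=1.0, beta=0.1):
    """Hypothetical joint loss (an assumption, not the paper's formula).

    Xv, Xt: visual and textual feature matrices (n_samples x n_features)
    Y:      one-hot label matrix (n_samples x n_classes)
    Wv, Wt: weight matrices of the two linear classifiers
    """
    Sv = Xv @ Wv                      # class scores from visual features
    St = Xt @ Wt                      # class scores from textual features
    fit = np.sum((Sv - Y) ** 2) + np.sum((St - Y) ** 2)
    agree = np.sum((Sv - St) ** 2)    # couples the two classifiers
    sparse = l21_norm(Wv) + l21_norm(Wt)
    return float(fit + alpha * agree + beta * sparse)

def fuse_predict(Xv, Xt, Wv, Wt):
    """Combine both classifiers by summing scores, then pick the top class."""
    return np.argmax(Xv @ Wv + Xt @ Wt, axis=1)
```

Summing the two score matrices is one simple way to combine the classifiers; a weighted sum or a vote would fit the abstract's description equally well.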
Pages: 7469-7485
Number of pages: 16
Related papers
50 in total
  • [1] Classify social image by integrating multi-modal content
    Zhang, Xiaoming
    Zhang, Xu
    Li, Xiong
    Li, Zhoujun
    Wang, Senzhang
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (06) : 7469 - 7485
  • [2] Multi-modal Learning for Social Image Classification
    Liu, Chunyang
    Zhang, Xu
    Li, Xiong
    Li, Rui
    Zhang, Xiaoming
    Chao, Wenhan
    2016 12TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (ICNC-FSKD), 2016, : 1174 - 1179
  • [3] Multi-modal kernel ridge regression for social image classification
    Zhang, Xiaoming
    Chao, Wenhan
    Li, Zhoujun
    Liu, Chunyang
    Li, Rui
    APPLIED SOFT COMPUTING, 2018, 67 : 117 - 125
  • [4] Adaptive Remediation with Multi-modal Content
    Tu, Yuwei
    Brinton, Christopher G.
    Lan, Andrew S.
    Chiang, Mung
    ADAPTIVE INSTRUCTIONAL SYSTEMS, AIS 2019, 2019, 11597 : 455 - 468
  • [5] Integrating Multi-Modal Interfaces to Command UAVs
    Monajjemi, Valiallah
    Pourmehr, Shokoofeh
    Sadat, Seyed Abbas
    Zhan, Fei
    Wawerla, Jens
    Mori, Greg
    Vaughan, Richard
    HRI'14: PROCEEDINGS OF THE 2014 ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, 2014, : 106 - 106
  • [6] Semantically Multi-modal Image Synthesis
    Zhu, Zhen
    Xu, Zhiliang
    You, Ansheng
    Bai, Xiang
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 5466 - 5475
  • [7] Multi-modal semantic image segmentation
    Pemasiri, Akila
    Kien Nguyen
    Sridharan, Sridha
    Fookes, Clinton
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2021, 202
  • [8] Fusion of Multi-Modal Features for Efficient Content-Based Image Retrieval
    Frigui, Hichem
    Caudill, Joshua
    Ben Abdallah, Ahmed Chamseddine
    2008 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-5, 2008, : 1994 - 2000
  • [9] Content-independent embedding scheme for multi-modal medical image watermarking
    Nyeem, Hussain
    Boles, Wageeh
    Boyd, Colin
    BIOMEDICAL ENGINEERING ONLINE, 2015, 14
  • [10] Content-independent embedding scheme for multi-modal medical image watermarking
    Hussain Nyeem
    Wageeh Boles
    Colin Boyd
    BioMedical Engineering OnLine, 14