Classify social image by integrating multi-modal content

Cited by: 0
Authors
Xiaoming Zhang
Xu Zhang
Xiong Li
Zhoujun Li
Senzhang Wang
Affiliations
[1] Beihang University, Beijing Key Laboratory of Network Technology
[2] National Computer Network Emergency Response Technical Team of China, State Key Laboratory of Software Development Environment
[3] Beihang University, College of Computer Science and Technology
[4] Nanjing University of Aeronautics and Astronautics
Source
Multimedia Tools and Applications | 2018 / Vol. 77
Keywords
Image classification; Multi-modal classification; Social image analysis
DOI
Not available
Abstract
There is a growing volume of social images with the development of social networks and digital cameras. Usually, these images are annotated with textual tags in addition to their visual content. There is an urgent need to organize and manage this large number of social images automatically. Image classification is the basic task underlying these applications and has attracted great research effort. Although there are many studies on image classification, it is considerably challenging to integrate the multi-modal content of social images simultaneously for classification, since the textual content and visual content are represented in two heterogeneous feature spaces. In this paper, we propose a multi-modal learning method that integrates multi-modal features seamlessly through their correlation. Specifically, we learn two linear classification modules for the two types of features, and then integrate them by the l2 normalization method via a joint model. Each classifier is normalized with the l2,1 norm to reduce the effect of noisy features by selecting a subset of more important features. With the joint model, the classification based on visual features can be reinforced by the classification based on textual features, and vice versa. Then, the test image is classified based on both the textual features and visual features by combining the results of the two classifiers. Experiments conducted on real-world social image datasets demonstrate the superiority of our proposed method compared with representative baselines.
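As a rough illustration of two ideas named in the abstract, the sketch below computes an l2,1 norm (the sum of the l2 norms of a weight matrix's rows, which pushes whole feature rows toward zero) and combines the scores of two linear classifiers for a test image. It is not the authors' joint model or training procedure: the weighted-sum fusion, the `alpha` parameter, and all function names are assumptions made for illustration only.

```python
import math


def l21_norm(W):
    """l2,1 norm: sum over rows (features) of each row's l2 norm."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)


def linear_scores(x, W):
    """Class scores of a linear classifier; W has one row per feature
    and one column per class."""
    n_classes = len(W[0])
    return [sum(x[i] * W[i][c] for i in range(len(x))) for c in range(n_classes)]


def fuse_and_classify(x_visual, x_text, W_visual, W_text, alpha=0.5):
    """Hypothetical late fusion: weighted sum of the two score vectors.

    alpha is an assumed mixing weight, not a parameter from the paper.
    Returns the index of the highest fused score.
    """
    visual = linear_scores(x_visual, W_visual)
    textual = linear_scores(x_text, W_text)
    fused = [alpha * v + (1.0 - alpha) * t for v, t in zip(visual, textual)]
    return max(range(len(fused)), key=fused.__getitem__)


if __name__ == "__main__":
    # Toy weights: 2 visual features, 3 textual features, 2 classes.
    W_visual = [[1.0, 0.0], [0.0, 1.0]]
    W_text = [[0.0, 0.0], [0.0, 3.0], [0.0, 0.0]]
    print(l21_norm(W_text))
    print(fuse_and_classify([1.0, 0.0], [0.0, 1.0, 0.0], W_visual, W_text))
```

In the toy weights, two rows of `W_text` are entirely zero, which is the kind of row sparsity an l2,1 penalty encourages: the corresponding textual features contribute nothing to any class score.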
Pages: 7469 - 7485
Page count: 16
Related papers
50 in total
  • [11] Multi-Modal Image Retrieval by Integrating Web Image Annotation, Concept Matching and Fuzzy Ranking Techniques
    Su, Ja-Hwung
    Wang, Bo-Wen
    Hsu, Tien-Yu
    Chou, Chien-Li
    Tseng, Vincent S.
    INTERNATIONAL JOURNAL OF FUZZY SYSTEMS, 2010, 12 (02) : 136 - 149
  • [12] Cross-modal attention for multi-modal image registration
    Song, Xinrui
    Chao, Hanqing
    Xu, Xuanang
    Guo, Hengtao
    Xu, Sheng
    Turkbey, Baris
    Wood, Bradford J.
    Sanford, Thomas
    Wang, Ge
    Yan, Pingkun
    MEDICAL IMAGE ANALYSIS, 2022, 82
  • [13] Integrating multi-modal imaging in radiation treatments for glioblastoma
    Breen, William G.
    Aryal, Madhava P.
    Cao, Yue
    Kim, Michelle M.
    NEURO-ONCOLOGY, 2024, 26 : S17 - S25
  • [14] MULTI-MODAL IMAGE STITCHING WITH NONLINEAR OPTIMIZATION
    Saha, Arindam
    Maity, Soumyadip
    Bhowmick, Brojeshwar
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 1987 - 1991
  • [15] Multi-Modal Deformable Medical Image Registration
    Fookes, Clinton
    Sridharan, Sridha
    ICSPCS: 2ND INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATION SYSTEMS, PROCEEDINGS, 2008, : 661 - 669
  • [16] A variational approach to multi-modal image matching
    Chefd'Hotel, C
    Hermosillo, G
    Faugeras, O
    IEEE WORKSHOP ON VARIATIONAL AND LEVEL SET METHODS IN COMPUTER VISION, PROCEEDINGS, 2001, : 21 - 28
  • [17] Multi-Modal Image Captioning for the Visually Impaired
    Ahsan, Hiba
    Bhalla, Nikita
    Bhatt, Daivat
    Shah, Kaivankumar
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 53 - 60
  • [18] Multi-modal Image Fusion with KNN Matting
    Zhang, Xia
    Lin, Hui
    Kang, Xudong
    Li, Shutao
    PATTERN RECOGNITION (CCPR 2014), PT II, 2014, 484 : 89 - 96
  • [19] MixBERT for Multi-modal Matching in Image Advertising
    Yu, Tan
    Li, Xiaokang
    Xie, Jianwen
    Yin, Ruiyang
    Xu, Qing
    Li, Ping
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 3597 - 3602
  • [20] A Multi-modal SPM Model for Image Classification
    Zheng, Peng
    Zhao, Zhong-Qiu
    Gao, Jun
    INTELLIGENT COMPUTING METHODOLOGIES, ICIC 2017, PT III, 2017, 10363 : 525 - 535