SUPERVISED MULTI-MODAL TOPIC MODEL FOR IMAGE ANNOTATION

Cited by: 0
Authors
Tran, Thu Hoai [1]
Choi, Seungjin [1]
Affiliations
[1] POSTECH, Div IT Convergence Engn, Pohang, South Korea
Keywords
Image annotation; latent Dirichlet allocation; topic models
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Multi-modal topic models are probabilistic generative models in which hidden topics are learned from data of different types. In this paper we present supervised multi-modal latent Dirichlet allocation (smmLDA), which incorporates the class label (a global description) into the joint modeling of visual words and caption words (local descriptions) for the image annotation task. We derive a variational inference algorithm to approximate the posterior distribution over the latent variables. Experiments on a subset of the LabelMe dataset demonstrate the useful behavior of our model compared to existing topic models.
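
The abstract names the ingredients of smmLDA (a class label, visual words, caption words, and hidden topics) but does not spell out the generative process. As a rough illustration only, the Python sketch below samples from one plausible supervised multi-modal topic model; it should not be read as the authors' specification. The class-specific Dirichlet prior, the correspondence-LDA-style link in which each caption word reuses the topic of a randomly chosen visual word, and every name and size in the code are assumptions made for this example.

```python
import numpy as np


def sample_image(rng, class_alpha, visual_topics, caption_topics,
                 label, n_visual, n_caption):
    """Draw one synthetic (visual words, caption words) pair for a class label.

    Hypothetical generative sketch, not the model defined in the paper.
    """
    # Assumption: the class label (global description) selects the
    # Dirichlet prior over per-image topic proportions.
    theta = rng.dirichlet(class_alpha[label])
    # Each visual word picks a topic from theta, then a codeword from it.
    z = rng.choice(len(theta), size=n_visual, p=theta)
    visual = np.array(
        [rng.choice(visual_topics.shape[1], p=visual_topics[k]) for k in z]
    )
    # Assumption: each caption word reuses the topic of a uniformly
    # chosen visual word, then draws a tag from that topic.
    y = rng.integers(0, n_visual, size=n_caption)
    caption = np.array(
        [rng.choice(caption_topics.shape[1], p=caption_topics[z[i]]) for i in y]
    )
    return theta, z, visual, caption


# Illustrative sizes only; none of these come from the paper.
rng = np.random.default_rng(0)
n_classes, n_topics, n_codewords, n_tags = 3, 4, 20, 10
class_alpha = rng.gamma(2.0, 1.0, size=(n_classes, n_topics)) + 0.1
visual_topics = rng.dirichlet(np.ones(n_codewords), size=n_topics)
caption_topics = rng.dirichlet(np.ones(n_tags), size=n_topics)

theta, z, visual, caption = sample_image(
    rng, class_alpha, visual_topics, caption_topics,
    label=1, n_visual=50, n_caption=5,
)
```

In a model of this shape, annotation amounts to inferring the topic assignments of a new image's visual words and scoring caption words under those topics; the abstract states that the authors handle the corresponding posterior with variational inference, which this sampling sketch does not implement.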
Pages: 5