SUPERVISED MULTI-MODAL TOPIC MODEL FOR IMAGE ANNOTATION

Cited by: 0

Authors:

Tran, Thu Hoai [1]

Choi, Seungjin [1]

Affiliations:

[1] POSTECH, Div IT Convergence Engn, Pohang, South Korea

Source:

2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2014

Keywords:

Image annotation; latent Dirichlet allocation; topic models

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Multi-modal topic models are probabilistic generative models in which hidden topics are learned from data of different types. In this paper we present supervised multi-modal latent Dirichlet allocation (smmLDA), which incorporates the class label (a global description) into the joint modeling of visual words and caption words (local descriptions) for the image annotation task. We derive a variational inference algorithm to approximately compute the posterior distribution over the latent variables. Experiments on a subset of the LabelMe dataset demonstrate the useful behavior of our model compared to existing topic models.

Pages: 5