Multimodal Ensembling for Zero-Shot Image Classification

Cited by: 0

Authors:

Hickmon, Javon [1]

Affiliations:

[1] Univ Washington, Dept Comp Sci, Seattle, WA 98195 USA

Source:

THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 21 | 2024

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Artificial intelligence has made significant progress in image classification, an essential task for machine perception to achieve human-level image understanding. Despite recent advances in vision-language fields, multimodal image classification is still challenging, particularly for the following two reasons. First, models with low capacity often suffer from underfitting and thus underperform on fine-grained image classification. Second, it is important to ensure high-quality data with rich cross-modal representations of each class, which is often difficult to generate. Here, we utilize ensemble learning to reduce the impact of these issues on pre-trained models. We aim to create a meta-model that combines the predictions of multiple open-vocabulary multimodal models trained on different data to create more robust and accurate predictions. By utilizing ensemble learning and multimodal machine learning, we will achieve higher prediction accuracies without any additional training or fine-tuning, meaning that this method is completely zero-shot.
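
The abstract does not say how the meta-model combines the individual predictions, so the following is only a minimal sketch under one common assumption: soft voting, where each model's class scores are turned into probabilities and then averaged. The function names (`softmax`, `ensemble_predict`), the class labels, and the score values are all hypothetical and stand in for the image-text similarity scores that pre-trained open-vocabulary models would produce; no training or fine-tuning is involved.

```python
import math


def softmax(scores):
    """Convert raw similarity scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(per_model_scores, class_names, weights=None):
    """Average per-model class probabilities (soft voting).

    per_model_scores: one list of raw class scores per model (hypothetical
    stand-ins for image-text similarities from pre-trained models).
    Returns the winning class name and the combined probabilities.
    """
    if not per_model_scores:
        raise ValueError("need scores from at least one model")
    if weights is None:
        weights = [1.0] * len(per_model_scores)  # assumption: equal trust
    if len(weights) != len(per_model_scores):
        raise ValueError("one weight per model is required")

    weight_sum = sum(weights)
    combined = [0.0] * len(class_names)
    for weight, scores in zip(weights, per_model_scores):
        if len(scores) != len(class_names):
            raise ValueError("each model must score every class")
        for i, prob in enumerate(softmax(scores)):
            combined[i] += weight * prob / weight_sum

    best = max(range(len(class_names)), key=lambda i: combined[i])
    return class_names[best], combined


# Made-up scores for three fine-grained classes from three models.
classes = ["sparrow", "finch", "wren"]
scores_a = [2.0, 1.0, 0.0]  # model A alone would pick "sparrow"
scores_b = [0.0, 3.0, 0.0]  # model B strongly prefers "finch"
scores_c = [0.5, 1.0, 0.0]  # model C mildly prefers "finch"

label, probs = ensemble_predict([scores_a, scores_b, scores_c], classes)
print(label, [round(p, 3) for p in probs])
```

In this toy input the first model's top choice is outvoted by the other two, which is the kind of robustness an ensemble is meant to provide. Other combination rules (majority voting, learned weights) would fit the same description equally well.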


Pages: 23747-23749

Page count: 3
