Multimodal Ensembling for Zero-Shot Image Classification

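Citations: 0
Authors
Hickmon, Javon [1]
Affiliations
[1] Univ Washington, Dept Comp Sci, Seattle, WA 98195 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract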
Artificial intelligence has made significant progress in image classification, an essential task for machine perception on the way to human-level image understanding. Despite recent advances in vision-language research, multimodal image classification remains challenging for two main reasons. First, low-capacity models often underfit and therefore underperform on fine-grained image classification. Second, high-quality data with rich cross-modal representations of each class is important but often difficult to generate. Here, we use ensemble learning to reduce the impact of these issues on pre-trained models. We aim to build a meta-model that combines the predictions of multiple open-vocabulary multimodal models, each trained on different data, to produce more robust and accurate predictions. By combining ensemble learning with multimodal machine learning, we will achieve higher prediction accuracy without any additional training or fine-tuning, which makes the method completely zero-shot.
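
The abstract does not say how the meta-model combines predictions, so the following is only a minimal sketch of one common choice, soft voting: each model's per-class scores are turned into probabilities with a softmax, and the probabilities are averaged. The function names, the optional weights, the class labels, and the score values are all hypothetical and chosen for illustration; in practice the scores would come from each pre-trained model's image-text similarity for every candidate label. No parameters are trained, which is in keeping with the zero-shot setting described above.

```python
import math
from typing import List, Optional, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw per-class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(
    per_model_logits: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[int, List[float]]:
    """Weighted average of per-model class probabilities (soft voting).

    Returns the index of the winning class and the combined probabilities.
    Uniform weights are used when none are given (an assumption of this sketch).
    """
    n_models = len(per_model_logits)
    if weights is None:
        weights = [1.0] * n_models
    norm = sum(weights)
    n_classes = len(per_model_logits[0])
    combined = [0.0] * n_classes
    for w, logits in zip(weights, per_model_logits):
        for k, p in enumerate(softmax(logits)):
            combined[k] += (w / norm) * p
    best = max(range(n_classes), key=combined.__getitem__)
    return best, combined


# Made-up scores from three hypothetical models over three candidate labels.
class_names = ["sparrow", "finch", "wren"]
scores = [
    [2.0, 1.0, 0.1],  # model A
    [0.5, 1.5, 0.3],  # model B (disagrees with A and C)
    [1.8, 1.2, 0.2],  # model C
]
best, probs = ensemble_predict(scores)
print(class_names[best], [round(p, 3) for p in probs])
```

Averaging probabilities rather than taking a majority vote lets a confident model outweigh an uncertain one; passing non-uniform `weights` is one way to favor a model known to be stronger on a given domain.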
Pages: 23747-23749
Page count: 3