Distilling knowledge from multiple foundation models for zero-shot image classification

Authors
Yin, Siqi [1]
Jiang, Lifan [1]
Affiliations
[1] Shandong Univ Sci & Technol, Sch Comp Sci & Technol, Qingdao, Shandong, Peoples R China
Source
PLOS ONE | 2024 / Vol. 19 / No. 09
DOI
10.1371/journal.pone.0310730
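Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09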
Abstract
Zero-shot image classification enables the recognition of new categories without requiring additional training data, thereby enhancing the model's generalization capability when category-specific training data are unavailable. This paper introduces a zero-shot image classification framework that recognizes categories unseen during training by distilling knowledge from foundation models. Specifically, we first employ ChatGPT and DALL-E to synthesize reference images of unseen categories from text prompts. Then, the test image is aligned with the text and the reference images using CLIP and DINO to calculate the logits. Finally, the predicted logits are aggregated according to their confidence to produce the final prediction. Experiments are conducted on multiple datasets, including MNIST, SVHN, CIFAR-10, CIFAR-100, and TinyImageNet. The results demonstrate that our method can significantly improve classification accuracy compared to previous approaches, achieving AUROC scores of over 96% across all test datasets. Our code is available at https://github.com/1134112149/MICW-ZIC.
Pages: 12
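
The last step described in the abstract, aggregating the predicted logits according to their confidence, can be illustrated with a minimal sketch. The abstract does not give the weighting rule, so everything here is an assumption made for illustration: each branch's confidence is taken to be its maximum softmax probability, the fused prediction is the confidence-weighted average of the branch probabilities, and the function names and logit values are hypothetical. This is not the authors' implementation; see the linked repository for that.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_by_confidence(logit_sets):
    """Fuse several per-class logit vectors, weighting each by its confidence.

    Assumption (not from the paper): a branch's confidence is its maximum
    softmax probability, and weights are the normalized confidences.
    """
    probs = [softmax(logits) for logits in logit_sets]
    confidences = [max(p) for p in probs]
    total_conf = sum(confidences)
    weights = [c / total_conf for c in confidences]
    n_classes = len(probs[0])
    fused = [
        sum(w * p[k] for w, p in zip(weights, probs))
        for k in range(n_classes)
    ]
    return fused, weights


# Hypothetical logits for three candidate classes.
text_logits = [2.0, 0.5, 0.1]  # stands in for an image-to-text similarity branch
ref_logits = [0.3, 0.4, 0.2]   # stands in for an image-to-reference-image branch

fused, weights = aggregate_by_confidence([text_logits, ref_logits])
prediction = max(range(len(fused)), key=fused.__getitem__)
```

In this sketch the sharper (more peaked) branch receives the larger weight, so an uncertain branch cannot easily override a confident one. Other choices, such as entropy-based weights, would fit the same description in the abstract.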