Embracing Diversity: Interpretable Zero-shot classification beyond one vector per class

Authors
Moayeri, Mazda [1]
Rabbat, Michael [2]
Ibrahim, Mark [2]
Bouchacourt, Diane [2]
Affiliations
[1] Univ Maryland, College Pk, MD 20742 USA
[2] FAIR Meta, New York, NY USA
Keywords
Bias; Fairness; Vision Language Models (VLMs); Zero-shot; Classification
DOI
10.1145/3630106.3659039
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Vision-language models enable open-world classification of objects without the need for any retraining. While this zero-shot paradigm marks a significant advance, even today's best models exhibit skewed performance when objects are dissimilar from their typical depiction. Real-world objects such as pears appear in a variety of forms (from diced to whole, on a table or in a bowl), yet standard VLM classifiers map all instances of a class to a single vector based on the class label. We argue that to represent this rich diversity within a class, zero-shot classification should move beyond a single vector. We propose a method to encode and account for diversity within a class using inferred attributes, still in the zero-shot setting without retraining. We find our method consistently outperforms standard zero-shot classification over a large suite of datasets encompassing hierarchies, diverse object states, and real-world geographic diversity, as well as finer-grained datasets where intra-class diversity may be less prevalent. Importantly, our method is inherently interpretable, offering faithful explanations for each inference to facilitate model debugging and enhance transparency. We also find our method scales efficiently to a large number of attributes to account for diversity, leading to more accurate predictions for atypical instances. Finally, we characterize a principled trade-off between overall and worst-class accuracy, which can be tuned via a hyperparameter of our method. We hope this work spurs further research into the promise of zero-shot classification beyond a single class vector for capturing diversity in the world, and building transparent AI systems without compromising performance.
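The contrast the abstract draws between one vector per class and several vectors per class can be illustrated with a minimal sketch. This is not the authors' implementation: the 3-dimensional embeddings are made-up stand-ins for VLM text and image embeddings, the function names are hypothetical, and scoring a class by the mean of its top-k most similar vectors is an assumed consolidation rule (the parameter `k` is illustrative and is not claimed to be the hyperparameter the abstract mentions).

```python
import math


def cosine(u, v):
    """Cosine similarity between two plain-Python vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def single_vector_predict(image, class_vectors):
    """Standard zero-shot rule: one vector per class, pick the most similar."""
    return max(class_vectors, key=lambda c: cosine(image, class_vectors[c]))


def multi_vector_predict(image, class_to_vectors, k=1):
    """Assumed rule: score each class by the mean of its top-k similarities.

    Returns the predicted class and the names of the vectors that produced
    its score, which serve as a simple explanation of the prediction.
    """
    scores, explanations = {}, {}
    for cls, named_vectors in class_to_vectors.items():
        sims = sorted(
            ((cosine(image, vec), name) for name, vec in named_vectors.items()),
            reverse=True,
        )
        top = sims[:k]
        scores[cls] = sum(s for s, _ in top) / len(top)
        explanations[cls] = [name for _, name in top]
    best = max(scores, key=scores.get)
    return best, explanations[best]


# Toy embeddings (invented for illustration, not produced by a real model).
class_vectors = {"pear": [1.0, 0.0, 0.0], "apple": [0.0, 1.0, 0.0]}
class_to_vectors = {
    "pear": {"pear": [1.0, 0.0, 0.0], "diced pear": [0.5, 0.5, 0.7]},
    "apple": {"apple": [0.0, 1.0, 0.0], "sliced apple": [0.1, 0.9, 0.4]},
}
image = [0.5, 0.6, 0.62]  # stands in for an atypical (diced) pear image

print(single_vector_predict(image, class_vectors))
print(multi_vector_predict(image, class_to_vectors, k=1))
```

On this toy data the single-vector rule picks "apple", while the multi-vector rule picks "pear" and reports "diced pear" as the vector responsible, which mirrors the kind of per-inference explanation the abstract describes.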
Pages: 2302-2321
Number of pages: 20