Embracing Diversity: Interpretable Zero-shot classification beyond one vector per class

Citations: 0
Authors
Moayeri, Mazda [1]
Rabbat, Michael [2]
Ibrahim, Mark [2]
Bouchacourt, Diane [2]
Affiliations
[1] Univ Maryland, College Pk, MD 20742 USA
[2] FAIR Meta, New York, NY USA
Keywords
Bias; Fairness; Vision Language Models (VLMs); Zero-shot; Classification;
DOI
10.1145/3630106.3659039
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Vision-language models enable open-world classification of objects without the need for any retraining. While this zero-shot paradigm marks a significant advance, even today's best models exhibit skewed performance when objects are dissimilar from their typical depiction. Real-world objects such as pears appear in a variety of forms, from diced to whole, on a table or in a bowl, yet standard VLM classifiers map all instances of a class to a single vector based on the class label. We argue that to represent this rich diversity within a class, zero-shot classification should move beyond a single vector. We propose a method to encode and account for diversity within a class using inferred attributes, still in the zero-shot setting without retraining. We find our method consistently outperforms standard zero-shot classification over a large suite of datasets encompassing hierarchies, diverse object states, and real-world geographic diversity, as well as finer-grained datasets where intra-class diversity may be less prevalent. Importantly, our method is inherently interpretable, offering faithful explanations for each inference to facilitate model debugging and enhance transparency. We also find our method scales efficiently to a large number of attributes to account for diversity, leading to more accurate predictions for atypical instances. Finally, we characterize a principled trade-off between overall and worst-class accuracy, which can be tuned via a hyperparameter of our method. We hope this work spurs further research into the promise of zero-shot classification beyond a single class vector for capturing diversity in the world, and building transparent AI systems without compromising performance.
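The contrast the abstract draws between one vector per class and several attribute-based vectors per class can be illustrated with a toy sketch. This is not the authors' implementation: the vectors below are hand-made stand-ins for VLM embeddings, the function names are hypothetical, and the aggregation rule (averaging the k highest similarities within a class) is an assumption chosen for illustration, since the abstract does not specify how attribute vectors are combined.

```python
import numpy as np

def normalize(x):
    """Scale vectors to unit length so dot products are cosine similarities."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def single_vector_predict(image, class_vectors):
    """Standard zero-shot: one vector per class, pick the most similar class."""
    sims = normalize(class_vectors) @ normalize(image)
    return int(np.argmax(sims))

def multi_vector_predict(image, class_to_vectors, k=1):
    """Several vectors per class; score a class by the mean of its k highest
    similarities (assumed aggregation, hypothetical parameter k)."""
    img = normalize(image)
    scores = []
    for vectors in class_to_vectors:
        sims = np.sort(normalize(vectors) @ img)[::-1]  # descending order
        scores.append(sims[:k].mean())
    return int(np.argmax(scores))

# Toy embeddings (made up): class 0 = "pear", class 1 = "apple".
class_vectors = [[1.0, 0.0, 0.0],   # "pear"
                 [0.0, 0.0, 1.0]]   # "apple"
class_to_vectors = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.2]],  # "whole pear", "diced pear"
    [[0.0, 0.0, 1.0], [0.2, 0.0, 1.0]],  # "whole apple", "sliced apple"
]
atypical_pear = [0.1, 1.0, 0.3]  # stands in for an image of a diced pear

single_pred = single_vector_predict(atypical_pear, class_vectors)
multi_pred = multi_vector_predict(atypical_pear, class_to_vectors, k=1)
best_attr = int(np.argmax(normalize(class_to_vectors[multi_pred]) @ normalize(atypical_pear)))
print(single_pred, multi_pred, best_attr)
```

In this toy setup the single-vector classifier assigns the atypical instance to the wrong class, while the multi-vector classifier recovers the right one, and the index of the best-matching attribute vector gives a simple explanation of which within-class variant drove the prediction.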
Pages: 2302-2321
Number of pages: 20