Prototype-Based Semantic Segmentation

Cited by: 11
Authors
Zhou, Tianfei [1]
Wang, Wenguan [2]
Affiliations
[1] Beijing Inst Technol, Dept Comp Sci, Beijing 100811, Peoples R China
[2] Zhejiang Univ, CCAI, Hangzhou 310027, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Prototypes; Measurement; Semantic segmentation; Image segmentation; Vectors; Semantics; Transformers; prototype; nonparametric classification; online clustering; REPRESENTATION;
DOI
10.1109/TPAMI.2024.3387116
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep learning based semantic segmentation solutions have yielded compelling results over the preceding decade. They encompass diverse network architectures (FCN based or attention based), along with various mask decoding schemes (parametric softmax based or pixel-query based). Despite the divergence, they can be grouped within a unified framework by interpreting the softmax weights or query vectors as learnable class prototypes. In light of this prototype view, we reveal inherent limitations within the parametric segmentation regime, and accordingly develop a nonparametric alternative based on non-learnable prototypes. In contrast to previous approaches that entail the learning of a single weight/query vector per class in a fully parametric manner, our approach represents each class as a set of non-learnable prototypes, relying solely upon the mean features of training pixels within that class. The pixel-wise prediction is thus achieved by nonparametric nearest prototype retrieving. This allows our model to directly shape the pixel embedding space by optimizing the arrangement between embedded pixels and anchored prototypes. It is able to accommodate an arbitrary number of classes with a constant number of learnable parameters. Through empirical evaluation with FCN based and Transformer based segmentation models (i.e., HRNet, Swin, SegFormer, Mask2Former) and backbones (i.e., ResNet, HRNet, Swin, MiT), our nonparametric framework shows superior performance on standard segmentation datasets (i.e., ADE20K, Cityscapes, COCO-Stuff), as well as in large-vocabulary semantic segmentation scenarios. We expect that this study will provoke a rethink of the current de facto semantic segmentation model design.
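The nearest-prototype retrieval described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`class_mean_prototypes`, `nearest_prototype_predict`), the use of cosine similarity, and the choice of a single mean prototype per class are assumptions made for illustration, and the online clustering mentioned in the keywords is not reproduced here.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length; all-zero vectors stay zero."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def class_mean_prototypes(features, labels, num_classes):
    """Build one non-learnable prototype per class (hypothetical helper).

    features: (N, D) pixel embeddings, labels: (N,) integer class ids.
    Returns an array of shape (C, 1, D); the middle axis is the number of
    prototypes per class (K = 1 here, an assumption for brevity).
    """
    dim = features.shape[1]
    prototypes = np.zeros((num_classes, 1, dim))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            # Prototype = mean feature of the training pixels in class c.
            prototypes[c, 0] = features[mask].mean(axis=0)
    return l2_normalize(prototypes)


def nearest_prototype_predict(features, prototypes):
    """Assign each pixel the class of its most similar prototype.

    features: (N, D), prototypes: (C, K, D). Cosine similarity is assumed.
    """
    feats = l2_normalize(features)
    # Similarity of every pixel to every prototype: shape (N, C, K).
    sims = np.einsum("nd,ckd->nck", feats, prototypes)
    # A class is scored by its best-matching prototype.
    class_scores = sims.max(axis=2)
    return class_scores.argmax(axis=1)


# Toy usage: two classes in a 2-D embedding space.
train_feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
train_labels = np.array([0, 0, 1, 1])
protos = class_mean_prototypes(train_feats, train_labels, num_classes=2)
preds = nearest_prototype_predict(np.array([[2.0, 0.1], [0.2, 3.0]]), protos)
```

In this sketch the prototypes carry no learnable parameters, so adding a class only adds rows to the prototype array, which is consistent with the abstract's claim of a constant number of learnable parameters.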
Pages: 6858 - 6872
Page count: 15
Related papers
50 in total
  • [31] Prototype-Based Modeling for Facial Expression Analysis
    Dahmane, Mohamed
    Meunier, Jean
    IEEE TRANSACTIONS ON MULTIMEDIA, 2014, 16 (06) : 1574 - 1584
  • [32] Development of a prototype-based measure of relational boredom
    Harasymchuk, Cheryl
    Fehr, Beverley
    PERSONAL RELATIONSHIPS, 2012, 19 (01) : 162 - 181
  • [33] COLLABORATIVE CLUSTERING USING PROTOTYPE-BASED TECHNIQUES
    Ghassany, Mohamad
    Grozavu, Nistor
    Bennani, Younes
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2012, 11 (03)
  • [34] Data warehouse testing: A prototype-based methodology
    Golfarelli, Matteo
    Rizzi, Stefano
    INFORMATION AND SOFTWARE TECHNOLOGY, 2011, 53 (11) : 1183 - 1198
  • [35] Rethinking Semantic Segmentation: A Prototype View
    Zhou, Tianfei
    Wang, Wenguan
    Konukoglu, Ender
    Van Gool, Luc
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 2572 - 2583
  • [36] Learning Prototype-based Classifiers by Margin Maximization
    Wakou, Chiharu
    Kusunoki, Yoshifumi
    Tatsumi, Keiji
    2017 JOINT 17TH WORLD CONGRESS OF INTERNATIONAL FUZZY SYSTEMS ASSOCIATION AND 9TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (IFSA-SCIS), 2017,
  • [37] PROTOTYPE-BASED ASSESSMENT OF LAYPEOPLES VIEWS OF LOVE
    FEHR, B
    PERSONAL RELATIONSHIPS, 1994, 1 (04) : 309 - 331
  • [38] Hyperparameter learning in probabilistic prototype-based models
    Schneider, Petra
    Biehl, Michael
    Hammer, Barbara
    NEUROCOMPUTING, 2010, 73 (7-9) : 1117 - 1124
  • [39] Complexity reduction in efficient prototype-based classification
    Ferri, FJ
    Sánchez, JS
    Pla, F
    PATTERN RECOGNITION, 2006, 39 (02) : 161 - 163
  • [40] Prototype-Based Interpretable Graph Neural Networks
    Ragno A.
    Rosa B.L.
    Capobianco R.
    IEEE Transactions on Artificial Intelligence, 2024, 5 (04) : 1486 - 1495