Prototype-Based Semantic Segmentation

Cited by: 11

Authors:

Zhou, Tianfei [1]

Wang, Wenguan [2]

Affiliations:

[1] Beijing Inst Technol, Dept Comp Sci, Beijing 100811, Peoples R China

[2] Zhejiang Univ, CCAI, Hangzhou 310027, Peoples R China

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2024, Vol. 46, No. 10

Funding:

National Natural Science Foundation of China;

Keywords:

Prototypes; Measurement; Semantic segmentation; Image segmentation; Vectors; Semantics; Transformers; prototype; nonparametric classification; online clustering; REPRESENTATION;

DOI:

10.1109/TPAMI.2024.3387116

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deep learning based semantic segmentation solutions have yielded compelling results over the preceding decade. They encompass diverse network architectures (FCN based or attention based), along with various mask decoding schemes (parametric softmax based or pixel-query based). Despite the divergence, they can be grouped within a unified framework by interpreting the softmax weights or query vectors as learnable class prototypes. In light of this prototype view, we reveal inherent limitations within the parametric segmentation regime, and accordingly develop a nonparametric alternative based on non-learnable prototypes. In contrast to previous approaches that entail the learning of a single weight/query vector per class in a fully parametric manner, our approach represents each class as a set of non-learnable prototypes, relying solely upon the mean features of training pixels within that class. The pixel-wise prediction is thus achieved by nonparametric nearest prototype retrieving. This allows our model to directly shape the pixel embedding space by optimizing the arrangement between embedded pixels and anchored prototypes. It is able to accommodate an arbitrary number of classes with a constant number of learnable parameters. Through empirical evaluation with FCN based and Transformer based segmentation models (i.e., HRNet, Swin, SegFormer, Mask2Former) and backbones (i.e., ResNet, HRNet, Swin, MiT), our nonparametric framework shows superior performance on standard segmentation datasets (i.e., ADE20K, Cityscapes, COCO-Stuff), as well as in large-vocabulary semantic segmentation scenarios. We expect that this study will provoke a rethink of the current de facto semantic segmentation model design.
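
The nearest-prototype idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`build_prototypes`, `predict`, `make_toy_data`), the toy data, the choice of cosine similarity, and the few rounds of offline k-means used to form sub-class prototypes are all assumptions made for illustration (the keywords mention online clustering, which is not reproduced here). The sketch only shows that each class is represented by a set of prototypes computed as mean pixel features, and that a pixel is labeled by the class of its most similar prototype, with no learned classifier weights.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def build_prototypes(features, labels, num_classes, k=2, iters=5, seed=0):
    """Return (num_classes, k, dim) prototypes.

    Each prototype is the normalized mean of a group of pixel features from
    one class. Grouping uses a few k-means rounds (an assumption of this
    sketch). Every class must have at least k pixels.
    """
    rng = np.random.default_rng(seed)
    features = l2_normalize(features)
    dim = features.shape[1]
    protos = np.zeros((num_classes, k, dim))
    for c in range(num_classes):
        x = features[labels == c]
        # Initialize the k centers from randomly chosen pixels of this class.
        centers = x[rng.choice(len(x), size=k, replace=False)]
        for _ in range(iters):
            assign = np.argmax(x @ centers.T, axis=1)
            for j in range(k):
                members = x[assign == j]
                if len(members) > 0:  # an empty group keeps its old center
                    centers[j] = l2_normalize(members.mean(axis=0))
        protos[c] = centers
    return protos

def predict(features, protos):
    """Label each pixel with the class of its most similar prototype."""
    features = l2_normalize(features)
    num_classes, k, dim = protos.shape
    sims = features @ protos.reshape(num_classes * k, dim).T  # (n, C*k)
    class_scores = sims.reshape(-1, num_classes, k).max(axis=2)  # (n, C)
    return class_scores.argmax(axis=1)

def make_toy_data(num_classes=3, per_class=50, dim=8, seed=0):
    """Hypothetical pixel embeddings: one well-separated direction per class."""
    rng = np.random.default_rng(seed)
    feats, labs = [], []
    for c in range(num_classes):
        center = np.zeros(dim)
        center[c] = 5.0
        feats.append(center + 0.1 * rng.standard_normal((per_class, dim)))
        labs.append(np.full(per_class, c))
    return np.concatenate(feats), np.concatenate(labs)

features, labels = make_toy_data()
protos = build_prototypes(features, labels, num_classes=3, k=2)
pred = predict(features, protos)
print("prototype tensor shape:", protos.shape)
print("training-pixel accuracy:", (pred == labels).mean())
```

In this sketch, adding a class only adds rows to the prototype tensor, which mirrors the abstract's point that the number of learnable parameters need not grow with the number of classes.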

Pages: 6858-6872

Page count: 15
