CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation

Cited by: 42

Authors:

Yu, Qihang [1,4]

Wang, Huiyu [1]

Kim, Dahun [2]

Qiao, Siyuan [3]

Collins, Maxwell [3]

Zhu, Yukun [3]

Adam, Hartwig [3]

Yuille, Alan [1]

Chen, Liang-Chieh [3]

Affiliations:

[1] Johns Hopkins Univ, Baltimore, MD 21218 USA

[2] Korea Adv Inst Sci & Technol, Daejeon, South Korea

[3] Google Res, Mountain View, CA USA

[4] Google, Mountain View, CA 94043 USA

Source:

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022

DOI:

10.1109/CVPR52688.2022.00259

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We propose Clustering Mask Transformer (CMT-DeepLab), a transformer-based framework for panoptic segmentation designed around clustering. It rethinks the existing transformer architectures used in segmentation and detection; CMT-DeepLab considers the object queries as cluster centers, which fill the role of grouping the pixels when applied to segmentation. The clustering is computed with an alternating procedure, by first assigning pixels to the clusters by their feature affinity, and then updating the cluster centers and pixel features. Together, these operations comprise the Clustering Mask Transformer (CMT) layer, which produces cross-attention that is denser and more consistent with the final segmentation task. CMT-DeepLab improves the performance over prior art significantly by 4.4% PQ, achieving a new state-of-the-art of 55.7% PQ on the COCO test-dev set.
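
The alternating procedure described in the abstract (assign pixels to clusters by feature affinity, then update the cluster centers) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `clustering_step` and `cluster_pixels`, the dot-product affinity, the softmax taken over clusters, and the weighted-mean center update are all assumptions chosen for illustration, and the pixel-feature update mentioned in the abstract is omitted.

```python
import numpy as np


def clustering_step(pixel_feats, centers):
    """One assign-then-update round (hypothetical helper, not from the paper).

    pixel_feats: (N, D) array, one feature vector per pixel.
    centers:     (K, D) array, one vector per cluster center (object query).
    """
    # Affinity between every pixel and every cluster center: shape (N, K).
    affinity = pixel_feats @ centers.T

    # Softmax over the cluster axis, so each pixel spreads its weight
    # across clusters (assumed here; subtract the max for stability).
    affinity = affinity - affinity.max(axis=1, keepdims=True)
    weights = np.exp(affinity)
    weights /= weights.sum(axis=1, keepdims=True)

    # Hard assignment: the cluster with the largest weight for each pixel.
    assign = weights.argmax(axis=1)

    # Update each center as the weight-averaged pixel feature.
    mass = weights.sum(axis=0)[:, None]
    new_centers = (weights.T @ pixel_feats) / np.maximum(mass, 1e-8)
    return assign, new_centers


def cluster_pixels(pixel_feats, centers, num_iters=5):
    """Alternate assignment and center update for a fixed number of rounds."""
    assign = None
    for _ in range(num_iters):
        # `assign` is computed against the centers from before this update.
        assign, centers = clustering_step(pixel_feats, centers)
    return assign, centers
```

Calling `cluster_pixels(feats, init_centers)` on an `(N, D)` feature array and a `(K, D)` array of initial centers returns one cluster index per pixel together with the refined centers. In a segmentation setting, the per-pixel indices would play the role of the predicted mask grouping.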

Pages: 2550-2560

Page count: 11
