PCSformer: Pair-wise Cross-scale Sub-prototypes mining with CNN-transformers for weakly supervised semantic segmentation

Cited by: 0
Authors
Liu, Chunmeng [1]
Shen, Yao [1]
Xiao, Qingguo [2]
Li, Guangyao [1]
Affiliations
[1] Tongji Univ, 4800 Caoan Highway, Shanghai 201804, Peoples R China
[2] Linyi Univ, Shuangling Rd, Linyi 276000, Shandong, Peoples R China
Keywords
Weakly supervised learning; Semantic segmentation; Transformer; Deep learning; Computer vision; NETWORK
DOI
10.1016/j.neucom.2024.127834
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Generating initial seeds is an important step in weakly supervised semantic segmentation (WSSS). Our approach concentrates on generating and refining initial seeds. The convolutional neural networks (CNNs)-based initial seeds focus only on the most discriminative regions and lack global information about the target. The Vision Transformer (ViT)-based approach can capture long-range feature dependencies due to the unique advantage of the self-attention mechanism. Still, we find that it suffers from distractor object leakage and background leakage problems. Based on these observations, we propose PCSformer, which improves the model's ability to extract features through a Pair-wise Cross-scale (PC) strategy and solves the problem of distractor object leakage by further extracting potential target features through Sub-Prototypes (SP) mining. In addition, the proposed Conflict Self-Elimination (CSE) module further alleviates the background leakage problem. We validate our approach on the widely adopted Pascal VOC 2012 and MS COCO 2014, and extensive experiments demonstrate our superior performance. Furthermore, our method proves to be competitive for WSSS in medical images and challenging scenarios involving deformable and cluttered scenes. Additionally, we extend the PCSformer to weakly supervised object localization tasks, further highlighting its scalability and versatility.
Pages: 13
Related papers
  • [1] MECPformer: multi-estimations complementary patch with CNN-transformers for weakly supervised semantic segmentation
    Liu, Chunmeng
    Li, Guangyao
    Shen, Yao
    Wang, Ruiqi
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (31): 23249-23264