Local optimization cropping and boundary enhancement for end-to-end weakly-supervised segmentation network

Cited by: 0
Authors
Wang, Weizheng [1]
Zeng, Chao [1]
Wang, Haonan [1]
Zhou, Lei [1]
Affiliations
[1] Changsha Univ Sci & Technol, Changsha 410000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep learning; Weakly-supervised semantic segmentation; Computer vision; Single-stage; Boundary enhancement; Local optimization cropping; CONVOLUTIONAL NETWORKS;
DOI
10.1016/j.cviu.2024.104260
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, the performance of weakly-supervised semantic segmentation (WSSS) has improved significantly. WSSS usually employs image-level labels to generate Class Activation Maps (CAMs), from which pseudo-labels are produced, greatly reducing the cost of annotation. Because CNNs cannot fully identify object regions, researchers have found that Vision Transformers (ViT) can complement CNNs by better extracting global contextual information. However, ViT also introduces the problem of over-smoothing. Considerable progress has been made on the over-smoothing problem in recent years, yet two issues remain. First, the high-confidence regions in the network-generated CAM still contain areas irrelevant to the class. Second, CAM boundaries are inaccurate and include a small portion of background regions, and the precision of label boundaries is closely tied to segmentation performance. To address the first issue, we propose a local optimized cropping module (LOC): by randomly cropping selected regions, we contrast the local class tokens with the global class tokens, which improves consistency between local and global representations. To address the second issue, we design a boundary enhancement module (BE) that uses an erasing strategy to re-train on the image, strengthening the network's extraction of boundary information and greatly improving the accuracy of CAM boundaries, thereby enhancing the quality of pseudo-labels. Experiments on the PASCAL VOC dataset show that our proposed LOC-BE Net outperforms multi-stage methods and is competitive with end-to-end methods. On the PASCAL VOC dataset, our method achieves a CAM mIoU of 74.2% and a segmentation mIoU of 73.1%. On the COCO2014 dataset, our method achieves a CAM mIoU of 43.8% and a segmentation mIoU of 43.4%. Our code is open source: https://github.com/whn786/LOC-BE/tree/main.
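The local-versus-global contrast described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that): the function names `random_crop_box` and `local_global_consistency_loss` are hypothetical, crop boxes are drawn uniformly at random rather than from selected regions, and a plain cosine-distance term stands in for whatever contrastive loss the paper actually uses.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)  # epsilon avoids division by zero


def random_crop_box(height, width, crop_size, rng):
    """Pick a square crop (top, left, bottom, right) inside the image.

    Assumption: uniform sampling; the paper crops *selected* regions.
    """
    top = rng.randint(0, height - crop_size)   # randint is inclusive
    left = rng.randint(0, width - crop_size)
    return top, left, top + crop_size, left + crop_size


def local_global_consistency_loss(global_token, local_tokens):
    """Mean cosine distance between the global class token and each
    local (crop-level) class token. Lower means more consistent.

    Assumption: a simple stand-in for the paper's contrastive objective.
    """
    distances = [1.0 - cosine_similarity(global_token, t) for t in local_tokens]
    return sum(distances) / len(distances)


if __name__ == "__main__":
    rng = random.Random(0)
    print(random_crop_box(height=224, width=224, crop_size=96, rng=rng))
    # Toy tokens: in practice these would come from the ViT encoder.
    global_token = [1.0, 0.0]
    local_tokens = [[1.0, 0.0], [0.0, 1.0]]
    print(local_global_consistency_loss(global_token, local_tokens))
```

In this toy setup a local token that matches the global token contributes a distance near zero, while an orthogonal one contributes a distance of one, so minimizing the loss pulls crop-level representations toward the image-level one.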
Pages: 12