Local optimization cropping and boundary enhancement for end-to-end weakly-supervised segmentation network

Cited by: 0
Authors
Wang, Weizheng [1]
Zeng, Chao [1]
Wang, Haonan [1]
Zhou, Lei [1]
Affiliations
[1] Changsha Univ Sci & Technol, Changsha 410000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep learning; Weakly-supervised semantic segmentation; Computer vision; Single-stage; Boundary enhancement; Local optimization cropping; CONVOLUTIONAL NETWORKS;
DOI
10.1016/j.cviu.2024.104260
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, the performance of weakly-supervised semantic segmentation (WSSS) has increased significantly. WSSS usually employs image-level labels to generate Class Activation Maps (CAMs) for producing pseudo-labels, which greatly reduces the cost of annotation. Since CNNs cannot fully identify object regions, researchers found that Vision Transformers (ViT) can complement the deficiencies of CNNs by better extracting global contextual information. However, ViT also introduces the problem of over-smoothing. Great progress has been made in recent years to solve the over-smoothing problem, yet two issues remain. The first issue is that the high-confidence regions in the network-generated CAM still contain areas irrelevant to the class. The second issue is the inaccuracy of CAM boundaries, which contain a small portion of background regions. The precision of label boundaries is closely tied to segmentation performance. In this work, to address the first issue, we propose a local optimized cropping module (LOC). By randomly cropping selected regions, we allow the local class tokens to be contrasted with the global class tokens, which improves the consistency between local and global representations. To address the second issue, we design a boundary enhancement module (BE) that uses an erasing strategy to re-train the image, increasing the network's extraction of boundary information and greatly improving the accuracy of CAM boundaries, thereby enhancing the quality of pseudo-labels. Experiments on the PASCAL VOC dataset show that our proposed LOC-BE Net outperforms multi-stage methods and is competitive with end-to-end methods. On the PASCAL VOC dataset, our method achieves a CAM mIoU of 74.2% and a segmentation mIoU of 73.1%. On the COCO2014 dataset, our method achieves a CAM mIoU of 43.8% and a segmentation mIoU of 43.4%. Our code is open source: https://github.com/whn786/LOC-BE/tree/main.
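The local-versus-global token idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how regions are selected or which contrastive loss is used, so the CAM-threshold selection rule, the cosine-distance loss, and the names `random_crop_boxes` and `cosine_consistency_loss` are all assumptions made for illustration.

```python
import numpy as np


def random_crop_boxes(cam, num_crops, crop_size, threshold, rng):
    """Sample square crop boxes centred on high-confidence CAM pixels.

    Assumption: "selected regions" means pixels with CAM score >= threshold.
    Returns a list of (top, left, bottom, right) boxes inside the map.
    """
    h, w = cam.shape
    ys, xs = np.nonzero(cam >= threshold)
    if len(ys) == 0:
        return []
    boxes = []
    for _ in range(num_crops):
        i = rng.integers(len(ys))
        # Clamp so the crop stays fully inside the CAM.
        top = int(np.clip(ys[i] - crop_size // 2, 0, h - crop_size))
        left = int(np.clip(xs[i] - crop_size // 2, 0, w - crop_size))
        boxes.append((top, left, top + crop_size, left + crop_size))
    return boxes


def cosine_consistency_loss(global_token, local_tokens):
    """Mean (1 - cosine similarity) between one global class token
    and each local class token (a stand-in for a contrastive loss)."""
    g = global_token / (np.linalg.norm(global_token) + 1e-8)
    l = local_tokens / (np.linalg.norm(local_tokens, axis=1, keepdims=True) + 1e-8)
    return float(np.mean(1.0 - l @ g))
```

In a real pipeline the local tokens would come from passing each cropped region through the same encoder as the full image; here they are plain vectors so the loss can be checked in isolation.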
Pages: 12