Local optimization cropping and boundary enhancement for end-to-end weakly-supervised segmentation network

Cited by: 0

Authors:

Wang, Weizheng [1]

Zeng, Chao [1]

Wang, Haonan [1]

Zhou, Lei [1]

Affiliations:

[1] Changsha Univ Sci & Technol, Changsha 410000, Peoples R China

Source:

COMPUTER VISION AND IMAGE UNDERSTANDING | 2025 / Vol. 251

Funding:

National Natural Science Foundation of China;

Keywords:

Deep learning; Weakly-supervised semantic segmentation; Computer vision; Single-stage; Boundary enhancement; Local optimization cropping; CONVOLUTIONAL NETWORKS;

DOI:

10.1016/j.cviu.2024.104260

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In recent years, the performance of weakly-supervised semantic segmentation (WSSS) has improved significantly. WSSS usually employs image-level labels to generate Class Activation Maps (CAMs) for producing pseudo-labels, which greatly reduces the cost of annotation. Since CNNs cannot fully identify object regions, researchers found that Vision Transformers (ViT) can complement CNNs by better extracting global contextual information. However, ViT also introduces the problem of over-smoothing. Great progress has been made in recent years on the over-smoothing problem, yet two issues remain. The first is that the high-confidence regions in the network-generated CAM still contain areas irrelevant to the class. The second is that CAM boundaries are inaccurate and include a small portion of background regions. The precision of label boundaries is closely tied to segmentation performance. In this work, to address the first issue, we propose a local optimized cropping module (LOC). By randomly cropping selected regions, we allow the local class tokens to be contrasted with the global class tokens, which improves consistency between local and global representations. To address the second issue, we design a boundary enhancement module (BE) that uses an erasing strategy to re-train on the image, increasing the network's extraction of boundary information and greatly improving the accuracy of CAM boundaries, thereby enhancing the quality of pseudo-labels. Experiments on the PASCAL VOC dataset show that our proposed LOC-BE Net outperforms multi-stage methods and is competitive with end-to-end methods. On the PASCAL VOC dataset, our method achieves a CAM mIoU of 74.2% and a segmentation mIoU of 73.1%. On the COCO2014 dataset, our method achieves a CAM mIoU of 43.8% and a segmentation mIoU of 43.4%. Our code has been open sourced: https://github.com/whn786/LOC-BE/tree/main.
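The abstract reports results as mIoU (mean intersection-over-union). As a rough illustration of what that metric measures, the sketch below computes it for flattened label sequences in plain Python. The function name `mean_iou`, the `ignore_index` default of 255, and the choice to average only over classes that appear in the prediction or the ground truth are assumptions made for this example; they are not taken from the paper or its repository, and benchmark evaluation typically accumulates counts over the whole dataset rather than one image.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int, ignore_index: int = 255) -> float:
    """Mean IoU over classes that occur in pred or target.

    Assumes every value in pred lies in [0, num_classes) and that
    target uses ignore_index for pixels excluded from evaluation.
    """
    inter = [0] * num_classes  # pixels where pred == target == class
    union = [0] * num_classes  # pixels where pred == class or target == class
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip unlabeled pixels
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Tiny two-class example: class 0 has IoU 1/2, class 1 has IoU 2/3.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(round(score, 4))
```

A CAM mIoU would apply the same computation to pseudo-labels derived from CAMs, while a segmentation mIoU applies it to the final network predictions.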


Pages: 12

50 items in total

[31] Saliency Background Guided Network for Weakly-Supervised Semantic Segmentation
Bai X.
Li W.
Wang W.
Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2021, 34 (09) : 824 - 835
[32] Saliency guided deep network for weakly-supervised image segmentation
Sun, Fengdong
Li, Wenhui
PATTERN RECOGNITION LETTERS, 2019, 120 : 62 - 68
[33] Deep graph cut network for weakly-supervised semantic segmentation
Feng, Jiapei
Wang, Xinggang
Liu, Wenyu
SCIENCE CHINA-INFORMATION SCIENCES, 2021, 64 (03)
[34] Deep graph cut network for weakly-supervised semantic segmentation
Jiapei FENG
Xinggang WANG
Wenyu LIU
Science China (Information Sciences), 2021, 64 (03) : 57 - 68
[35] Deep graph cut network for weakly-supervised semantic segmentation
Jiapei Feng
Xinggang Wang
Wenyu Liu
Science China Information Sciences, 2021, 64
[36] Autonomous Navigation for Mobile Robots with Weakly-Supervised Segmentation Network
Huang, Peinan
Li, Jialun
He, Jianping
2022 IEEE 96TH VEHICULAR TECHNOLOGY CONFERENCE (VTC2022-FALL), 2022
[37] Self-supervised end-to-end graph local clustering
Zhe Yuan
World Wide Web, 2023, 26 : 1157 - 1179
[38] Self-supervised end-to-end graph local clustering
Yuan, Zhe
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2023, 26 (03) : 1157 - 1179
[39] End-to-end trainable network for superpixel and image segmentation
Wang, Kai
Li, Liang
Zhang, Jiawan
PATTERN RECOGNITION LETTERS, 2020, 140 : 135 - 142
[40] Face attribute recognition via end-to-end weakly supervised regional location
Jian Shi
Ge Sun
Jinyu Zhang
Zhihui Wang
Haojie Li
Multimedia Systems, 2023, 29 : 2137 - 2152
