Contrastive Tokens and Label Activation for Remote Sensing Weakly Supervised Semantic Segmentation

Cited by: 2
Authors
Hu, Zaiyi [1]
Gao, Junyu [1, 2]
Yuan, Yuan [1]
Li, Xuelong [3]
Affiliations
[1] Northwestern Polytech Univ, Sch Artificial Intelligence Opt & Elect iOPEN, Xian 710072, Peoples R China
[2] Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
[3] China Telecom Corp Ltd, Inst Artificial Intelligence TeleAI, Beijing 100033, Peoples R China
Keywords
Remote sensing; Semantic segmentation; Training; Task analysis; Semantics; Convolutional neural networks; Transformers; Deep learning; remote sensing images; vision transformer (ViT); weakly supervised semantic segmentation (WSSS)
DOI
10.1109/TGRS.2024.3385747
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry]
Subject classification codes
0708; 070902
Abstract
In recent years, weakly supervised semantic segmentation (WSSS) has made remarkable progress, with vision transformer (ViT) architectures emerging as a natural fit for the task because their global attention captures comprehensive object information. However, directly applying ViT to WSSS introduces challenges. ViT is prone to oversmoothing, particularly in the dense scenes of remote sensing images, which significantly degrades class activation maps (CAMs) and complicates segmentation. Moreover, existing methods often adopt multistage strategies, adding complexity and reducing training efficiency. To overcome these challenges, we present Contrastive Token and Foreground Activation (CTFA), a comprehensive ViT-based framework for WSSS of remote sensing images. The method includes a contrastive token learning module (CTLM) that combines patch-wise and class-wise token learning to enhance model performance. In patch-wise learning, we leverage the semantic diversity preserved in the intermediate layers of ViT: we derive a relation matrix from these layers and use it to supervise the final output tokens, thereby improving CAM quality. In class-wise learning, we enforce consistent representations between global and local tokens, which reveals more complete object regions. Additionally, by activating foreground features in the generated pseudo labels with a dual-branch decoder, we further improve CAM generation. Our approach demonstrates outstanding results across three well-established datasets, providing a more efficient and streamlined solution for WSSS. Code will be available at: https://github.com/ZaiyiHu/CTFA.
Pages: 1 / 11
Page count: 11
Related papers
50 in total
  • [41] Background Activation Suppression for Weakly Supervised Object Localization and Semantic Segmentation
    Zhai, Wei
    Wu, Pingyu
    Zhu, Kai
    Cao, Yang
    Wu, Feng
    Zha, Zheng-Jun
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (03) : 750 - 775
  • [42] Hierarchical Augmentation and Region-Aware Contrastive Learning for Semi-Supervised Semantic Segmentation of Remote Sensing Images
    Luo, Yuan
    Sun, Bin
    Li, Shutao
    Hu, Yulong
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [43] Learning deep semantic segmentation network under multiple weakly-supervised constraints for cross-domain remote sensing image semantic segmentation
    Li, Yansheng
    Shi, Te
    Zhang, Yongjun
    Chen, Wei
    Wang, Zhibin
    Li, Hao
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2021, 175 : 20 - 33
  • [44] Dense Supervised Dual-Aware Contrastive Learning for Airborne Laser Scanning Weakly Supervised Semantic Segmentation
    Luo, Ziwei
    Zeng, Tao
    Jiang, Xinyi
    Peng, Qingyu
    Ma, Ying
    Xie, Zhong
    Pan, Xiong
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [45] Hierarchical Semantic Contrast for Weakly Supervised Semantic Segmentation
    Wu, Yuanchen
    Li, Xiaoqiang
    Dai, Songmin
    Li, Jide
    Liu, Tong
    Xie, Shaorong
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 1542 - 1550
  • [46] Remote Sensing Image Coding for Machines on Semantic Segmentation via Contrastive Learning
    Zhang, Junxi
    Chen, Zhenzhong
    Liu, Shan
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [47] Contrastive Learning for Label Efficient Semantic Segmentation
    Zhao, Xiangyun
    Vemulapalli, Raviteja
    Mansfield, Philip Andrew
    Gong, Boqing
    Green, Bradley
    Shapira, Lior
    Wu, Ying
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 10603 - 10613
  • [48] Adaptive Multitype Contrastive Views Generation for Remote Sensing Image Semantic Segmentation
    Shi, Cheng
    Han, Peiwen
    Zhao, Minghua
    Fang, Li
    Miao, Qiguang
    Pun, Chi-Man
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [49] Pseudo-Label-Free Weakly Supervised Semantic Segmentation Using Image Masking
    Kim, Sangtae
    Luong Trung Nguyen
    Shim, Kyuhong
    Kim, Junhan
    Shim, Byonghyo
    IEEE ACCESS, 2022, 10 : 19401 - 19411
  • [50] Extraction of Erigeron breviscapus Planting Information by Unmanned Aerial Vehicle Remote Sensing Based on Weakly Supervised Semantic Segmentation
    Huang L.
    Wu C.
    Li X.
    Yang W.
    Yao W.
    Nongye Jixie Xuebao/Transactions of the Chinese Society for Agricultural Machinery, 2022, 53 (04) : 157 - 163 and 217