A shape-aware enhancement Vision Transformer for building extraction from remote sensing imagery

Cited by: 1
Authors
Yiming, Tuerhong [1]
Tang, Xiaoyan [1, 2]
Shang, Haibin [1]
Affiliations
[1] Xinjiang Univ, Coll Civil Engn & Architecture, Urumqi, Xinjiang, Peoples R China
[2] Xinjiang Univ, Coll Civil Engn & Architecture, Urumqi 830000, Xinjiang, Peoples R China
Keywords
Deep learning; building extraction; Vision Transformer; long-range independence; shape feature enhancement; FOOTPRINT EXTRACTION; SEGMENTATION; NETWORK;
DOI
10.1080/01431161.2024.2307325
Chinese Library Classification number
TP7 [Remote sensing technology];
Discipline classification codes
081102 ; 0816 ; 081602 ; 083002 ; 1404 ;
Abstract
Convolutional neural networks (CNNs) have been used for several years to extract buildings from remote sensing images. The Vision Transformer (ViT) has recently demonstrated superior performance over CNNs, thanks to its ability to model long-range dependencies through self-attention. However, most existing ViT models do not enhance shape information for building objects, which leads to insufficiently fine-grained segmentation. To address this limitation, we construct an efficient dual-path ViT framework for building segmentation, termed the shape-aware enhancement Vision Transformer (SAEViT). Our approach incorporates a shape-aware enhancement module (SAEM) that perceives and enhances the shape features of buildings using convolutional kernels of multiple shapes. We also introduce multi-pooling channel attention (MPCA) to exploit channel-wise information without squeezing the channel dimension. Furthermore, we propose a progressive aggregation upsampling model (PAUM) in the decoder, which aggregates multilevel features through progressive upsampling combined with the soft-pool algorithm operating along the channel axis. We evaluate our model on three public building datasets. The experimental results show that SAEViT obtains a significant improvement across these datasets, confirming its efficacy. Compared with several state-of-the-art models, SAEViT achieves better overall performance.
Pages: 1250-1276
Number of pages: 27
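
The abstract mentions a soft-pool operation applied along the channel axis but does not give its formulation. As a rough illustration only, the sketch below assumes the general soft-pooling idea (a sum of activations weighted by their softmax probabilities) applied to the channel vector at each pixel. The function names `soft_pool_channels` and `soft_pool_feature_map` are hypothetical and are not taken from the paper, and the actual PAUM design may differ.

```python
import math


def soft_pool_channels(pixel):
    """Soft-pool one pixel's channel vector into a single scalar.

    Assumption: each channel value is weighted by its softmax probability,
    so large activations dominate without discarding the smaller ones.
    """
    peak = max(pixel)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in pixel]
    total = sum(exps)
    return sum(e / total * v for e, v in zip(exps, pixel))


def soft_pool_feature_map(feature_map):
    """Apply channel-axis soft-pooling to an H x W x C nested list.

    Returns an H x W nested list with one pooled value per pixel.
    """
    return [[soft_pool_channels(pixel) for pixel in row] for row in feature_map]
```

For a non-uniform channel vector the pooled value falls between the mean and the maximum, which is the usual motivation for preferring soft-pooling over plain average or max pooling.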