A shape-aware enhancement Vision Transformer for building extraction from remote sensing imagery

Cited by: 1
Authors
Yiming, Tuerhong [1]
Tang, Xiaoyan [1, 2]
Shang, Haibin [1]
Affiliations
[1] Xinjiang Univ, Coll Civil Engn & Architecture, Urumqi, Xinjiang, Peoples R China
[2] Xinjiang Univ, Coll Civil Engn & Architecture, Urumqi 830000, Xinjiang, Peoples R China
Keywords
Deep learning; building extraction; Vision Transformer; long-range independence; shape feature enhancement; FOOTPRINT EXTRACTION; SEGMENTATION; NETWORK;
DOI
10.1080/01431161.2024.2307325
Chinese Library Classification (CLC) number
TP7 [Remote sensing technology];
Discipline classification codes
081102 ; 0816 ; 081602 ; 083002 ; 1404 ;
Abstract
Convolutional neural networks (CNNs) have been used for several years to extract buildings from remote sensing images. The Vision Transformer (ViT) has recently demonstrated superior performance over CNNs, thanks to its ability to model long-range dependencies through self-attention. However, most existing ViT models lack shape information enhancement for building objects, resulting in insufficient fine-grained segmentation. To address this limitation, we construct an efficient dual-path ViT framework for building segmentation, termed the shape-aware enhancement Vision Transformer (SAEViT). Our approach incorporates a shape-aware enhancement module (SAEM) that perceives and enhances the shape features of buildings using convolutional kernels of multiple shapes. We also introduce multi-pooling channel attention (MPCA) to exploit channel-wise information without squeezing the channel dimension. Furthermore, we propose a progressive aggregation upsampling model (PAUM) in the decoder that aggregates multilevel features through progressive upsampling, coupled with the soft-pool algorithm operating on the channel axis. We evaluate our model on three public building datasets. The experimental results show that SAEViT obtains a significant improvement on these datasets, confirming its efficacy. Compared with several state-of-the-art models, SAEViT achieves superior overall performance.
Pages: 1250 - 1276
Number of pages: 27
Related papers
50 in total
  • [31] BUILDING EXTRACTION FROM HIGH-RESOLUTION REMOTE SENSING IMAGERY BASED ON MULTI-SCALE FEATURE FUSION AND ENHANCEMENT
    Chen, Y.
    Cheng, H.
    Yao, S.
    Hu, Z.
    XXIV ISPRS CONGRESS: IMAGING TODAY, FORESEEING TOMORROW, COMMISSION III, 2022, 43-B3 : 55 - 60
  • [32] UANet: An Uncertainty-Aware Network for Building Extraction From Remote Sensing Images
    Li, Jiepan
    He, Wei
    Cao, Weinan
    Zhang, Liangpei
    Zhang, Hongyan
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 13
  • [33] Foreground-Aware Refinement Network for Building Extraction from Remote Sensing Images
    Zhang Yan
    Wang Xiangyu
    Zhang Zhongwei
    Sun Yemei
    Liu Shudong
    PHOTOGRAMMETRIC ENGINEERING AND REMOTE SENSING, 2022, 88 (11) : 731 - 738
  • [34] BUILDING EXTRACTION IN VHR REMOTE SENSING IMAGERY THROUGH DEEP LEARNING
    Atik, Saziye Ozge
    Ipbuker, Cengizhan
    FRESENIUS ENVIRONMENTAL BULLETIN, 2022, 31 (8A) : 8468 - 8473
  • [35] Building extraction from remote sensing images with deep learning: A survey on vision techniques
    Yuan, Yuan
    Shi, Xiaofeng
    Gao, Junyu
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2025, 251
  • [36] Discriminative Context-Aware Network for Target Extraction in Remote Sensing Imagery
    Hu, Lei
    Niu, Chuang
    Ren, Shenghan
    Dong, Minghao
    Zheng, Changli
    Zhang, Wei
    Liang, Jimin
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2022, 15 : 700 - 715
  • [37] Research on building extraction from remote sensing imagery using efficient lightweight residual network
    Gao, Ai
    Yang, Guang
    PEERJ COMPUTER SCIENCE, 2024, 10
  • [38] LEFORMER: A HYBRID CNN-TRANSFORMER ARCHITECTURE FOR ACCURATE LAKE EXTRACTION FROM REMOTE SENSING IMAGERY
    Chen, Ben
    Zou, Xuechao
    Zhang, Yu
    Li, Jiayu
    Li, Kai
    Xing, Junliang
    Tao, Pin
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024 : 5710 - 5714
  • [39] MSTrans: Multi-Scale Transformer for Building Extraction from HR Remote Sensing Images
    Yang, Fei
    Jiang, Fenlong
    Li, Jianzhao
    Lu, Lei
    ELECTRONICS, 2024, 13 (23)
  • [40] Structure-Aware Weakly Supervised Network for Building Extraction From Remote Sensing Images
    Chen, Hui
    Cheng, Liang
    Zhuang, Qizhi
    Zhang, Ka
    Li, Ning
    Liu, Lei
    Duan, Zhixin
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60