A shape-aware enhancement Vision Transformer for building extraction from remote sensing imagery

Cited by: 1
Authors
Yiming, Tuerhong [1]
Tang, Xiaoyan [1, 2]
Shang, Haibin [1]
Affiliations
[1] Xinjiang Univ, Coll Civil Engn & Architecture, Urumqi, Xinjiang, Peoples R China
[2] Xinjiang Univ, Coll Civil Engn & Architecture, Urumqi 830000, Xinjiang, Peoples R China
Keywords
Deep learning; building extraction; Vision Transformer; long-range independence; shape feature enhancement; FOOTPRINT EXTRACTION; SEGMENTATION; NETWORK;
DOI
10.1080/01431161.2024.2307325
Chinese Library Classification
TP7 [Remote sensing technology];
Discipline classification codes
081102 ; 0816 ; 081602 ; 083002 ; 1404 ;
Abstract
Convolutional neural networks (CNNs) have been used for several years to extract buildings from remote sensing images. The Vision Transformer (ViT) has recently demonstrated superior performance over CNNs, thanks to its ability to model long-range dependencies through self-attention. However, most existing ViT models do not enhance the shape information of building objects, which leads to insufficiently fine-grained segmentation. To address this limitation, we construct an efficient dual-path ViT framework for building segmentation, termed the shape-aware enhancement Vision Transformer (SAEViT). Our approach incorporates a shape-aware enhancement module (SAEM) that perceives and enhances the shape features of buildings using convolutional kernels of multiple shapes. We also introduce multi-pooling channel attention (MPCA) to exploit channel-wise information without squeezing the channel dimension. Furthermore, we propose a progressive aggregation upsampling model (PAUM) in the decoder, which aggregates multilevel features through progressive upsampling combined with the soft-pool algorithm applied along the channel axis. We evaluate our model on three public building datasets. The experimental results show that SAEViT achieves significant improvements across these datasets, confirming its efficacy. Compared with several state-of-the-art models, SAEViT comprehensively outperforms them in overall performance.
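The abstract names a soft-pool operation on the channel axis but does not give its formulation. As a rough illustration only, the sketch below assumes the common SoftPool definition (a softmax-weighted sum of activations) and applies it across the channels of each pixel. The function names and the nested-list layout are hypothetical and are not taken from the paper.

```python
import math


def soft_pool_channels(pixel):
    """Softmax-weighted sum over one pixel's channel activations.

    Assumption: this follows the generic SoftPool idea, not the paper's
    exact PAUM formulation, which the abstract does not specify.
    """
    m = max(pixel)  # subtract the max so exp() cannot overflow
    exps = [math.exp(v - m) for v in pixel]
    total = sum(exps)
    return sum(e / total * v for e, v in zip(exps, pixel))


def soft_pool_feature_map(fmap):
    """Apply soft_pool_channels at every spatial position.

    fmap is a nested list shaped [height][width][channels];
    the result is shaped [height][width].
    """
    return [[soft_pool_channels(px) for px in row] for row in fmap]


if __name__ == "__main__":
    # One row, two pixels, three channels each (made-up values).
    fmap = [[[1.0, 1.0, 1.0], [0.0, 2.0, 4.0]]]
    print(soft_pool_feature_map(fmap))
```

Under this assumption the pooled value always lies between the channel mean and the channel maximum, so strong activations dominate without discarding the weaker ones entirely.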
Pages: 1250-1276
Page count: 27