A shape-aware enhancement Vision Transformer for building extraction from remote sensing imagery

Cited by: 1
Authors
Yiming, Tuerhong [1]
Tang, Xiaoyan [1, 2]
Shang, Haibin [1]
Affiliations
[1] Xinjiang Univ, Coll Civil Engn & Architecture, Urumqi, Xinjiang, Peoples R China
[2] Xinjiang Univ, Coll Civil Engn & Architecture, Urumqi 830000, Xinjiang, Peoples R China
Keywords
Deep learning; building extraction; Vision Transformer; long-range independence; shape feature enhancement; FOOTPRINT EXTRACTION; SEGMENTATION; NETWORK;
DOI
10.1080/01431161.2024.2307325
Chinese Library Classification number
TP7 [Remote sensing technology];
Discipline classification codes
081102 ; 0816 ; 081602 ; 083002 ; 1404 ;
Abstract
Convolutional neural networks (CNNs) have been used for several years to extract buildings from remote sensing images. The Vision Transformer (ViT) has recently demonstrated superior performance over CNNs, thanks to its ability to model long-range dependencies through self-attention. However, most existing ViT models do not enhance the shape information of building objects, which results in insufficiently fine-grained segmentation. To address this limitation, we construct an efficient dual-path ViT framework for building segmentation, termed the shape-aware enhancement Vision Transformer (SAEViT). Our approach incorporates a shape-aware enhancement module (SAEM) that perceives and enhances the shape features of buildings using convolutional kernels of multiple shapes. We also introduce multi-pooling channel attention (MPCA) to exploit channel-wise information without squeezing the channel dimension. Furthermore, we propose a progressive aggregation upsampling model (PAUM) in the decoder that aggregates multilevel features through progressive upsampling, combined with the soft-pool algorithm operating on the channel axis. We evaluate our model on three public building datasets. The experimental results show that SAEViT obtains a significant improvement across these datasets, confirming its efficacy. Compared with several state-of-the-art models, SAEViT achieves better overall performance.
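The abstract mentions a soft-pool algorithm operating on the channel axis but gives no formula. As a rough illustration only, the sketch below applies the generic SoftPool definition (a softmax-weighted sum of the pooled values) across the channels at each pixel. The function names `soft_pool` and `channel_soft_pool`, the nested-list layout `feature_map[c][h][w]`, and the choice to pool all channels into a single map are assumptions made for this example, not details taken from the paper.

```python
import math
from typing import List


def soft_pool(values: List[float]) -> float:
    """Generic SoftPool over one region: softmax-weighted sum of the values."""
    m = max(values)  # subtract the max before exponentiating, for numerical stability
    weights = [math.exp(v - m) for v in values]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


def channel_soft_pool(feature_map: List[List[List[float]]]) -> List[List[float]]:
    """Pool feature_map[c][h][w] across channels c, returning an [h][w] map.

    Assumption for illustration: every channel at a pixel forms one pooling region.
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    return [
        [
            soft_pool([feature_map[c][i][j] for c in range(channels)])
            for j in range(width)
        ]
        for i in range(height)
    ]


if __name__ == "__main__":
    # Two channels, one row, two columns (hypothetical values).
    fmap = [
        [[0.0, 1.0]],
        [[2.0, 1.0]],
    ]
    print(channel_soft_pool(fmap))
```

Because the weights grow with the activation, the pooled value lies between the mean and the maximum of the inputs, so strong responses dominate without discarding weaker ones entirely.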
Pages: 1250-1276
Number of pages: 27
Related papers
50 in total
  • [11] Cross-level and multiscale CNN-Transformer network for automatic building extraction from remote sensing imagery
    Yuan, Qinglie
    Xia, Bin
    INTERNATIONAL JOURNAL OF REMOTE SENSING, 2024, 45 (09) : 2893 - 2914
  • [12] On The Exploration of Vision Transformers in Remote Sensing Building Extraction
    Angelis, G. F.
    Domi, A.
    Zamichos, A.
    Tsourma, M.
    Drosou, A.
    Tzovaras, D.
    2022 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM), 2022, : 208 - +
  • [13] A Dual-Branch Fusion Network Based on Reconstructed Transformer for Building Extraction in Remote Sensing Imagery
    Wang, Yitong
    Wang, Shumin
    Dou, Aixia
    SENSORS, 2024, 24 (02)
  • [14] Study on hierarchical building extraction from high resolution remote sensing imagery
    You Y.
    Wang S.
    Wang B.
    Ma Y.
    Shen M.
    Liu W.
    Xiao L.
    Yaogan Xuebao/Journal of Remote Sensing, 2019, 23 (01): 125 - 136
  • [15] Building area extraction from the high spatial resolution remote sensing imagery
    Shi, Wenzao
    Mao, Zhengyuan
    Liu, Jinqing
    EARTH SCIENCE INFORMATICS, 2019, 12 (01) : 19 - 29
  • [16] Building area extraction from the high spatial resolution remote sensing imagery
    Wenzao Shi
    Zhengyuan Mao
    Jinqing Liu
    Earth Science Informatics, 2019, 12 : 19 - 29
  • [17] Road Extraction from Remote Sensing Imagery with Spatial Attention Based on Swin Transformer
    Zhu, Xianhong
    Huang, Xiaohui
    Cao, Weijia
    Yang, Xiaofei
    Zhou, Yunfei
    Wang, Shaokai
    REMOTE SENSING, 2024, 16 (07)
  • [18] Automated building extraction using satellite remote sensing imagery
    Hu, Qintao
    Zhen, Liangli
    Mao, Yao
    Zhou, Xi
    Zhou, Guozhong
    AUTOMATION IN CONSTRUCTION, 2021, 123
  • [19] Extracting Building Footprint From Remote Sensing Images by an Enhanced Vision Transformer Network
    Zhang, Hua
    Dou, Hu
    Miao, Zelang
    Zheng, Nanshan
    Hao, Ming
    Shi, Wenzhong
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [20] Information extraction from remote sensing imagery
    Huang, Xin
    Li, Jiayi
    Liao, Wenzhi
    Chanussot, Jocelyn
    Geo-Spatial Information Science, 2017, 20 (04) : 297 - 298