A shape-aware enhancement Vision Transformer for building extraction from remote sensing imagery

Cited by: 1
Authors
Yiming, Tuerhong [1]
Tang, Xiaoyan [1, 2]
Shang, Haibin [1]
Affiliations
[1] Xinjiang Univ, Coll Civil Engn & Architecture, Urumqi, Xinjiang, Peoples R China
[2] Xinjiang Univ, Coll Civil Engn & Architecture, Urumqi 830000, Xinjiang, Peoples R China
Keywords
Deep learning; building extraction; Vision Transformer; long-range independence; shape feature enhancement; FOOTPRINT EXTRACTION; SEGMENTATION; NETWORK;
DOI
10.1080/01431161.2024.2307325
Chinese Library Classification (CLC) number
TP7 [Remote sensing technology]
Discipline classification codes
081102; 0816; 081602; 083002; 1404
Abstract
Convolutional neural networks (CNNs) have been used for several years to extract buildings from remote sensing images. The Vision Transformer (ViT) has recently demonstrated superior performance over CNNs, thanks to its ability to model long-range dependencies through self-attention. However, most existing ViT models do not enhance the shape information of building objects, which leads to insufficiently fine-grained segmentation. To address this limitation, we construct an efficient dual-path ViT framework for building segmentation, termed the shape-aware enhancement Vision Transformer (SAEViT). Our approach incorporates a shape-aware enhancement module (SAEM) that perceives and enhances the shape features of buildings using convolutional kernels of multiple shapes. We also introduce multi-pooling channel attention (MPCA) to exploit channel-wise information without squeezing the channel dimension. Furthermore, we propose a progressive aggregation upsampling model (PAUM) in the decoder, which aggregates multilevel features through progressive upsampling combined with the soft-pool algorithm operating on the channel axis. We evaluate our model on three public building datasets. The experimental results show that SAEViT achieves a significant improvement across these datasets, confirming its efficacy. SAEViT also outperforms several state-of-the-art models in overall performance.
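As an illustration of the soft-pool operation on the channel axis mentioned in the abstract, the following is a minimal sketch of the generic SoftPool formulation (a softmax-weighted sum) applied across channels. It is an assumption-laden example and not the authors' PAUM implementation: the function name `soft_pool_channels` and the nested-list C x H x W layout are hypothetical choices made only for this sketch.

```python
import math


def soft_pool_channels(feature_map):
    """Soft-pool a C x H x W nested list over the channel axis.

    Hypothetical helper: each output pixel is the softmax-weighted sum of
    the values at that pixel across all channels, giving an H x W result.
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    pooled = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            values = [feature_map[c][i][j] for c in range(channels)]
            peak = max(values)  # subtract the max for numerical stability
            weights = [math.exp(v - peak) for v in values]
            total = sum(weights)
            pooled[i][j] = sum(w * v for w, v in zip(weights, values)) / total
    return pooled


# Two channels, one row, two columns.
example = [[[1.0, 2.0]], [[1.0, 4.0]]]
result = soft_pool_channels(example)
```

Because the weights grow exponentially with the activation, each pooled value lies between the channel-wise mean and the channel-wise maximum, so strong responses dominate without discarding the weaker ones entirely.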
Pages: 1250-1276
Number of pages: 27
Related papers
50 in total
  • [21] Building Extraction With Vision Transformer
    Wang, Libo
    Fang, Shenghui
    Meng, Xiaoliang
    Li, Rui
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [22] Efficient Inductive Vision Transformer for Oriented Object Detection in Remote Sensing Imagery
    Zhang, Cong
    Su, Jingran
    Ju, Yakun
    Lam, Kin-Man
    Wang, Qi
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [23] MSHFormer: A Multiscale Hybrid Transformer Network With Boundary Enhancement for VHR Remote Sensing Image Building Extraction
    Zhu, Panpan
    Song, Zhichao
    Liu, Jiale
    Yan, Jiazheng
    Luo, Xiaobo
    Tao, Yuxiang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [24] Boundary-Assisted Learning for Building Extraction from Optical Remote Sensing Imagery
    He, Sheng
    Jiang, Wanshou
    REMOTE SENSING, 2021, 13 (04) : 1 - 18
  • [25] A Review of Building Extraction From Remote Sensing Imagery: Geometrical Structures and Semantic Attributes
    Li, Qingyu
    Mou, Lichao
    Sun, Yao
    Hua, Yuansheng
    Shi, Yilei
    Zhu, Xiao Xiang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 15
  • [26] A Multiscale Segmentation Framework for Uncompleted Building Footprint Extraction from Remote Sensing Imagery
    Bello, Inuwa Mamuda
    Zhang, Ke
    Wang, Jingyu
    Aslam, Muhammad Azeem
    2021 IEEE ASIA-PACIFIC CONFERENCE ON GEOSCIENCE, ELECTRONICS AND REMOTE SENSING TECHNOLOGY (AGERS-2021), 2021, : 119 - 124
  • [27] Semisupervised Building Instance Extraction From High-Resolution Remote Sensing Imagery
    Fang, Fang
    Xu, Rui
    Li, Shengwen
    Hao, Qingyi
    Zheng, Kang
    Wu, Kaishun
    Wan, Bo
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [28] Building Extraction From Remote Sensing Imagery With a High-Resolution Capsule Network
    Yu, Yongtao
    Liu, Chao
    Gao, Junyong
    Jin, Shenghua
    Jiang, Xiaoling
    Jiang, Mingxin
    Zhang, Haiyan
    Zhang, Yahong
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
  • [29] Asymmetric Network Combining CNN and Transformer for Building Extraction from Remote Sensing Images
    Chang, Junhao
    Cen, Yuefeng
    Cen, Gang
    SENSORS, 2024, 24 (19)
  • [30] Shape-aware Medical Image Enhancement by Weighted Total Variation
    Yuzuriha, Ryota
    Okuda, Masahiro
    2018 IEEE 7TH GLOBAL CONFERENCE ON CONSUMER ELECTRONICS (GCCE 2018), 2018, : 537 - 540