Transformer-Based Semantic Segmentation for Extraction of Building Footprints from Very-High-Resolution Images

Cited by: 6
Authors
Song, Jia [1, 3]
Zhu, A-Xing [1, 2]
Zhu, Yunqiang [1]
Affiliations
[1] Chinese Acad Sci, Inst Geog Sci & Nat Resources Res, State Key Lab Resources & Environm Informat Syst, Beijing 100101, Peoples R China
[2] Univ Wisconsin, Dept Geog, Madison, WI 53706 USA
[3] Jiangsu Ctr Collaborat Innovat Geog Informat Resou, Nanjing 210023, Peoples R China
Keywords
vision transformer; hyperparameter; building; self-attention; deep learning; CLASSIFICATION; NETWORK;
DOI
10.3390/s23115166
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification codes
070302; 081704;
Abstract
Semantic segmentation with deep learning networks has become an important approach to extracting objects from very-high-resolution (VHR) remote sensing images. Vision Transformer networks have shown significant performance improvements over traditional convolutional neural networks (CNNs) in semantic segmentation, and their architectures differ from those of CNNs. The image patches, the linear embedding, and multi-head self-attention (MHSA) account for several of the main hyperparameters. How these should be configured for extracting objects from VHR images, and how they affect network accuracy, has not been sufficiently investigated. This article explores the role of vision Transformer networks in extracting building footprints from VHR images. Transformer-based models with different hyperparameter values were designed and compared, and the impact of those values on accuracy was analyzed. The results show that smaller image patches and higher-dimensional embeddings yield better accuracy. In addition, the Transformer-based network is shown to be scalable: it can be trained on general-scale graphics processing units (GPUs) with model sizes and training times comparable to those of CNNs while achieving higher accuracy. The study provides valuable insights into the potential of vision Transformer networks for object extraction from VHR images.
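To make the patch-size and embedding-dimension trade-off concrete, the following is a minimal sketch of how those two hyperparameters change the token count and the size of a standard ViT-style patch embedding. It is not the authors' model: the function name `vit_token_stats`, the 512-pixel tile, the patch sizes 16 and 8, and the embedding dimension 768 are illustrative assumptions, not values taken from the paper.

```python
def vit_token_stats(image_size, patch_size, embed_dim, channels=3):
    """Rough size figures for a ViT-style patch embedding (illustrative only).

    All arguments are hypothetical example values, not settings from the paper.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")

    per_side = image_size // patch_size           # patches along one image edge
    num_tokens = per_side * per_side              # sequence length fed to the encoder
    patch_dim = patch_size * patch_size * channels  # flattened pixels per patch
    # Linear embedding: one weight matrix (patch_dim x embed_dim) plus a bias.
    embed_params = patch_dim * embed_dim + embed_dim
    # Self-attention compares every token with every other token.
    attn_entries = num_tokens * num_tokens

    return {
        "num_tokens": num_tokens,
        "patch_dim": patch_dim,
        "embed_params": embed_params,
        "attn_entries_per_head": attn_entries,
    }


if __name__ == "__main__":
    # Halving the patch size quadruples the token count and multiplies the
    # attention matrix by sixteen, which is the cost of finer spatial detail.
    for patch_size in (16, 8):
        print(patch_size, vit_token_stats(512, patch_size, 768))
```

Under these assumptions, smaller patches give the network finer spatial units to label, which is consistent with the abstract's accuracy finding, but at the price of a longer token sequence and a quadratically larger attention matrix.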
Pages: 19