Transformer-Based Semantic Segmentation for Extraction of Building Footprints from Very-High-Resolution Images

Cited by: 6
Authors
Song, Jia [1,3]
Zhu, A-Xing [1,2]
Zhu, Yunqiang [1]
Affiliations
[1] Chinese Acad Sci, Inst Geog Sci & Nat Resources Res, State Key Lab Resources & Environm Informat Syst, Beijing 100101, Peoples R China
[2] Univ Wisconsin, Dept Geog, Madison, WI 53706 USA
[3] Jiangsu Ctr Collaborat Innovat Geog Informat Resou, Nanjing 210023, Peoples R China
Keywords
vision transformer; hyperparameter; building; self-attention; deep learning; CLASSIFICATION; NETWORK;
DOI
10.3390/s23115166
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
Semantic segmentation with deep learning networks has become an important approach to extracting objects from very-high-resolution (VHR) remote sensing images. Vision Transformer networks have shown significant performance improvements over traditional convolutional neural networks (CNNs) in semantic segmentation, and their architectures differ from those of CNNs. The size of image patches, the dimension of the linear embedding, and multi-head self-attention (MHSA) are among their main hyperparameters. How these should be configured for extracting objects from VHR images, and how they affect network accuracy, has not been sufficiently investigated. This article explores the role of vision Transformer networks in extracting building footprints from VHR images. Transformer-based models with different hyperparameter values were designed and compared, and the impact of those values on accuracy was analyzed. The results show that smaller image patches and higher-dimensional embeddings yield better accuracy. In addition, the Transformer-based network is shown to be scalable: it can be trained on general-scale graphics processing units (GPUs) with model sizes and training times comparable to those of CNNs while achieving higher accuracy. The study provides valuable insights into the potential of vision Transformer networks for object extraction from VHR images.
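The abstract singles out patch size and embedding dimension as hyperparameters that drive accuracy. As a rough illustration of why they also drive cost, the sketch below computes a few shape quantities for a generic ViT-style encoder. It is not the authors' model: the function name `vit_token_stats`, the 512-pixel tile, the 768-dimension embedding, the 12 heads, and the 3-band input are all assumed values chosen only for illustration.

```python
def vit_token_stats(image_size, patch_size, embed_dim, num_heads, channels=3):
    """Basic shape quantities for a generic ViT-style encoder (illustrative only).

    All arguments are hypothetical settings, not values taken from the paper.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    if embed_dim % num_heads != 0:
        raise ValueError("embed_dim must be divisible by num_heads")

    # A square image is cut into non-overlapping square patches; each patch
    # becomes one token in the sequence seen by self-attention.
    patches_per_side = image_size // patch_size
    num_tokens = patches_per_side ** 2

    return {
        "num_tokens": num_tokens,
        # MHSA splits the embedding evenly across heads.
        "head_dim": embed_dim // num_heads,
        # Each head forms a num_tokens x num_tokens attention matrix.
        "attention_entries": num_heads * num_tokens ** 2,
        # Linear patch embedding: one weight matrix plus a bias vector.
        "embedding_params": patch_size * patch_size * channels * embed_dim + embed_dim,
    }


if __name__ == "__main__":
    # Halving the patch size quadruples the token count and multiplies the
    # number of attention-matrix entries per layer by sixteen.
    for patch in (32, 16, 8):
        print(patch, vit_token_stats(512, patch, 768, 12))
```

Under these assumed settings, smaller patches give the network finer spatial detail at the price of a much longer token sequence, which is one reason the trade-off between patch size and training cost is worth examining.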
Pages: 19
Related papers
50 in total
  • [21] Symmetrical Dense-Shortcut Deep Fully Convolutional Networks for Semantic Segmentation of Very-High-Resolution Remote Sensing Images
    Chen, Guanzhou
    Zhang, Xiaodong
    Wang, Qing
    Dai, Fan
    Gong, Yuanfu
    Zhu, Kun
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2018, 11 (05) : 1633 - 1644
  • [22] Context-Enabled Extraction of Large-Scale Urban Functional Zones from Very-High-Resolution Images: A Multiscale Segmentation Approach
    Du, Shouji
    Du, Shihong
    Liu, Bo
    Zhang, Xiuyuan
    REMOTE SENSING, 2019, 11 (16)
  • [23] Detecting Building Changes Using Multimodal Siamese Multitask Networks From Very-High-Resolution Satellite Images
    Li, Mengmeng
    Liu, Xuanguang
    Wang, Xiaoqin
    Xiao, Pengfeng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [24] Transformer-Based Semantic Segmentation for Recycling Materials in Construction
    Wang, Xin
    Han, Wei
    Mo, Sicheng
    Cai, Ting
    Gong, Yijing
    Li, Yin
    Zhu, Zhenhua
    COMPUTING IN CIVIL ENGINEERING 2023-DATA, SENSING, AND ANALYTICS, 2024, : 25 - 33
  • [25] Federated Deep Learning With Prototype Matching for Object Extraction From Very-High-Resolution Remote Sensing Images
    Zhang, Xiaokang
    Zhang, Boning
    Yu, Weikang
    Kang, Xudong
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [26] Evaluating Transformer-based Semantic Segmentation Networks for Pathological Image Segmentation
    Cam Nguyen
    Asad, Zuhayr
    Deng, Ruining
    Huo, Yuankai
    MEDICAL IMAGING 2022: IMAGE PROCESSING, 2022, 12032
  • [27] Extraction of Rural Residential Land from Very-High Resolution UAV Images Using a Novel Semantic Segmentation Framework
    Sha, Chenggao
    Liu, Jian
    Wang, Lan
    Shan, Bowen
    Hou, Yaxian
    Wang, Ailing
    SUSTAINABILITY, 2022, 14 (19)
  • [29] AN IMPROVED DEEP-LEARNING MODEL FOR ROAD EXTRACTION FROM VERY-HIGH-RESOLUTION REMOTE SENSING IMAGES
    Shen, Wangyao
    Chen, Yunping
    Cheng, Yuanlei
    Yang, Kangzhuo
    Guo, Xiang
    Sung, Yuan
    Chen, Yan
    2021 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM IGARSS, 2021, : 4660 - 4663
  • [30] Shadow Pattern-Enhanced Building Height Extraction Using Very-High-Resolution Image
    Zhou, Xiran
    Myint, Soe W.
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2023, 16 : 180 - 190