MSST-Net: A Multi-Scale Adaptive Network for Building Extraction from Remote Sensing Images Based on Swin Transformer

Cited by: 51

Authors:

Yuan, Wei [1,2]

Xu, Wenbo [3]

Affiliations:

[1] Chengdu Univ, Sch Architecture & Civil Engn, Chengdu 610106, Peoples R China

[2] Chengdu Univ, Inst Higher Educ Sichuan Prov, Key Lab Pattern Recognit & Intelligent Informat P, Chengdu 610106, Peoples R China

[3] Univ Elect Sci & Technol China, Sch Resources & Environm, Chengdu 611731, Peoples R China

Source:

REMOTE SENSING | 2021 / Vol. 13 / Issue 23

Keywords:

deep learning; remote sensing; transformer; semantic segmentation; multi-scale adaptive; SEGMENTATION;

DOI:

10.3390/rs13234743

Chinese Library Classification (CLC) number:

X [Environmental Science, Safety Science];

Subject classification codes:

08 ; 0830 ;

Abstract:

Deep-learning-based segmentation is the main method for interpreting remote sensing images. However, segmentation models based on convolutional neural networks (CNNs) do not capture global features well. A transformer, whose self-attention mechanism supplies each pixel with a global feature, compensates for this deficiency. This paper therefore proposes MSST-Net, a multi-scale adaptive segmentation network based on the Swin Transformer. First, a Swin Transformer backbone encodes the input image. Second, the feature maps from different levels are decoded separately. Third, the decoded results are fused by convolution, so that the network automatically learns the weight of each level's decoding result. Finally, a convolution with a 1 × 1 kernel adjusts the channels to produce the final prediction map. Compared with other segmentation network models on the WHU building data set, MSST-Net improves all three evaluation metrics: mIoU, F1-score and accuracy. The proposed model is a multi-scale adaptive network that pays more attention to global features for remote sensing segmentation.
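
The fusion step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that each level's decoded map is first upsampled to a common resolution, that the maps are concatenated along the channel axis, and that the fusion convolution has a 1 × 1 kernel (the abstract gives the kernel size only for the final channel adjustment). The function names `upsample_nearest`, `conv1x1` and `fuse_levels` are hypothetical, and the weights are fixed by hand here, whereas in a trained network they would be learned.

```python
from typing import List

Grid = List[List[float]]      # one H x W channel
FeatureMap = List[Grid]       # C channels, each H x W


def upsample_nearest(channel: Grid, factor: int) -> Grid:
    """Nearest-neighbour upsampling of one channel by an integer factor."""
    out: Grid = []
    for row in channel:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def conv1x1(fmap: FeatureMap, weights: List[List[float]],
            bias: List[float]) -> FeatureMap:
    """1 x 1 convolution: every output channel is a per-pixel weighted
    sum of the input channels plus a bias."""
    h, w = len(fmap[0]), len(fmap[0][0])
    out: FeatureMap = []
    for w_row, b in zip(weights, bias):
        out.append([[b + sum(wc * fmap[c][i][j] for c, wc in enumerate(w_row))
                     for j in range(w)] for i in range(h)])
    return out


def fuse_levels(decoded: List[FeatureMap], weights: List[List[float]],
                bias: List[float]) -> FeatureMap:
    """Concatenate per-level decoder outputs along the channel axis,
    then mix them with a 1 x 1 convolution (assumed fusion kernel)."""
    stacked = [ch for level in decoded for ch in level]
    return conv1x1(stacked, weights, bias)


# Two hypothetical decoder outputs, one channel each.
fine = [[[1.0, 0.0],
         [0.0, 1.0]]]                         # already 2 x 2
coarse = [upsample_nearest([[4.0]], 2)]       # 1 x 1 upsampled to 2 x 2

# Hand-picked weights standing in for learned ones: 0.75 * fine + 0.25 * coarse.
fused = fuse_levels([fine, coarse], weights=[[0.75, 0.25]], bias=[0.0])
print(fused[0])
```

Because the mixing weights are ordinary convolution parameters, training can raise or lower the contribution of each level, which is one way to read the "adaptive" in the model's name.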


Pages: 14

50 items in total

[41] Transformer-based multi-scale feature fusion network for remote sensing change detection
Liang, Shike
Hua, Zhen
Li, Jinjiang
JOURNAL OF APPLIED REMOTE SENSING, 2022, 16 (04)
[42] ER-Swin: Feature Enhancement and Refinement Network Based on Swin Transformer for Semantic Segmentation of Remote Sensing Images
Liu, Jiang
Cheng, Shuli
Du, Anyu
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2024, 21 : 1 - 5
[43] Multi-scale detail enhancement network for remote sensing road extraction
Geng, Tingting
Cao, Yuan
Wang, Changqing
EARTH SCIENCE INFORMATICS, 2025, 18 (03)
[44] Road Extraction from Remote Sensing Imagery with Spatial Attention Based on Swin Transformer
Zhu, Xianhong
Huang, Xiaohui
Cao, Weijia
Yang, Xiaofei
Zhou, Yunfei
Wang, Shaokai
REMOTE SENSING, 2024, 16 (07)
[45] Multi-scale Contrastive Learning for Building Change Detection in Remote Sensing Images
Xue, Mingliang
Huo, Xinyuan
Lu, Yao
Niu, Pengyuan
Liang, Xuan
Shang, Hailong
Jia, Shucai
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT IV, 2024, 14428 : 318 - 329
[46] Object Detection in Remote Sensing Images Based on Adaptive Multi-Scale Feature Fusion Method
Liu, Chun
Zhang, Sixuan
Hu, Mengjie
Song, Qing
REMOTE SENSING, 2024, 16 (05)
[47] Adaptive Anchor Networks for Multi-Scale Object Detection in Remote Sensing Images
Zhang, Miaohui
Chen, Yunzhong
Liu, Xianxing
Lv, Bingxue
Wang, Jun
IEEE ACCESS, 2020, 8 : 57552 - 57565
[48] AFL-Net: Attentional Feature Learning Network for Building Extraction from Remote Sensing Images
Qiu, Yue
Wu, Fang
Qian, Haizhong
Zhai, Renjian
Gong, Xianyong
Yin, Jichong
Liu, Chengyi
Wang, Andong
REMOTE SENSING, 2023, 15 (01)
[49] Aircraft segmentation in remote sensing images based on multi-scale residual U-Net with attention
Xuqi Wang
Shanwen Zhang
Lei Huang
Multimedia Tools and Applications, 2024, 83 : 17855 - 17872
[50] MSCSA-Net: Multi-Scale Channel Spatial Attention Network for Semantic Segmentation of Remote Sensing Images
Liu, Kuan-Hsien
Lin, Bo-Yen
APPLIED SCIENCES-BASEL, 2023, 13 (17)
