SSformer: A Lightweight Transformer for Semantic Segmentation

Cited: 24
Authors
Shi, Wentao [1]
Xu, Jing [1]
Gao, Pan [1]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing, Peoples R China
Keywords
Image Segmentation; Transformer; Multilayer perceptron; Lightweight model
DOI
10.1109/MMSP55362.2022.9949177
CLC Classification Number
TP31 [Computer Software]
Discipline Codes
081202; 0835
Abstract
It is widely believed that Transformers perform better than convolutional neural networks in semantic segmentation. Nevertheless, the original Vision Transformer [2] may lack the inductive biases of local neighborhoods and has high time complexity. Recently, Swin Transformer [3] set a new record in various vision tasks by using a hierarchical architecture and shifted windows while being more efficient. However, as Swin Transformer is specifically designed for image classification, it may achieve suboptimal performance on dense prediction-based segmentation tasks. Further, simply combining Swin Transformer with existing methods would inflate the size and parameter count of the final segmentation model. In this paper, we rethink the Swin Transformer for semantic segmentation and design a lightweight yet effective transformer model, called SSformer. In this model, considering the inherent hierarchical design of Swin Transformer, we propose a decoder that aggregates information from different layers, thus capturing both local and global attention. Experimental results show that the proposed SSformer yields mIoU performance comparable to state-of-the-art models while maintaining a smaller model size and lower computational cost. Source code and pretrained models are available at: https://github.com/shiwt03/SSformer
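The abstract's key architectural idea is a lightweight decoder that aggregates the feature maps emitted by the different stages of a hierarchical Swin backbone. As a rough, hypothetical sketch of how such a multi-level all-MLP decoder is commonly built (the channel widths, class count, and fusion scheme below are assumptions for illustration, not SSformer's actual design; see the linked repository for that), consider:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LightweightDecoder(nn.Module):
    # Hypothetical all-MLP decoder over hierarchical Swin-style features.
    def __init__(self, in_channels=(96, 192, 384, 768),
                 embed_dim=256, num_classes=150):
        super().__init__()
        # One linear projection per backbone stage, mapping each stage's
        # channel width to a shared embedding width.
        self.projections = nn.ModuleList(
            nn.Linear(c, embed_dim) for c in in_channels
        )
        # 1x1 convs to fuse the concatenated stage features and classify.
        self.fuse = nn.Conv2d(embed_dim * len(in_channels), embed_dim, 1)
        self.classifier = nn.Conv2d(embed_dim, num_classes, 1)

    def forward(self, features):
        # features: four maps of shape (B, C_i, H_i, W_i) at strides 4/8/16/32.
        target_size = features[0].shape[2:]  # finest (stride-4) resolution
        projected = []
        for feat, proj in zip(features, self.projections):
            b, _, h, w = feat.shape
            x = proj(feat.flatten(2).transpose(1, 2))   # (B, H*W, embed_dim)
            x = x.transpose(1, 2).reshape(b, -1, h, w)  # (B, embed_dim, H, W)
            projected.append(F.interpolate(
                x, size=target_size, mode="bilinear", align_corners=False))
        fused = self.fuse(torch.cat(projected, dim=1))
        return self.classifier(fused)  # (B, num_classes, H/4, W/4)


# Toy usage with Swin-Tiny-like feature shapes for a 224x224 input.
feats = [torch.randn(1, c, 56 // 2 ** i, 56 // 2 ** i)
         for i, c in enumerate((96, 192, 384, 768))]
print(LightweightDecoder()(feats).shape)  # torch.Size([1, 150, 56, 56])
```

In this sketch, each stage's features are linearly projected to a shared width, upsampled to the finest resolution, concatenated, and fused with 1x1 convolutions, which keeps the decoder's parameter count small while still mixing local (early-stage) and global (late-stage) information.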
Pages: 5
Related Papers
50 records in total
  • [11] Lightweight convolutional neural networks with context broadcast transformer for real-time semantic segmentation
    Hu, Kaidi
    Xie, Zongxia
    Hu, Qinghua
    IMAGE AND VISION COMPUTING, 2024, 146
  • [12] Semantic segmentation of terrace image regions based on lightweight CNN-Transformer hybrid networks
    Liu X.
    Yi S.
    Li L.
    Cheng X.
    Wang C.
    Nongye Gongcheng Xuebao/Transactions of the Chinese Society of Agricultural Engineering, 2023, 39 (13): 171-181
  • [13] A lightweight network for smoke semantic segmentation
    Yuan, Feiniu
    Li, Kang
    Wang, Chunmei
    Fang, Zhijun
    PATTERN RECOGNITION, 2023, 137
  • [14] Eye Semantic Segmentation with A Lightweight Model
    Huynh, Van Thong
    Kim, Soo-Hyung
    Lee, Guee-Sang
    Yang, Hyung-Jeong
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019: 3694-3697
  • [15] Transformer Scale Gate for Semantic Segmentation
    Shi, Hengcan
    Hayat, Munawar
    Cai, Jianfei
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023: 3051-3060
  • [16] TransRVNet: LiDAR Semantic Segmentation With Transformer
    Cheng, Hui-Xian
    Han, Xian-Feng
    Xiao, Guo-Qiang
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (06): 5895-5907
  • [17] Pyramid Fusion Transformer for Semantic Segmentation
    Qin, Zipeng
    Liu, Jianbo
    Zhang, Xiaolin
    Tian, Maoqing
    Zhou, Aojun
    Yi, Shuai
    Li, Hongsheng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26: 9630-9643
  • [18] AISOA-SSformer: An Effective Image Segmentation Method for Rice Leaf Disease Based on the Transformer Architecture
    Dai, Weisi
    Zhu, Wenke
    Zhou, Guoxiong
    Liu, Genhua
    Xu, Jiaxin
    Zhou, Hongliang
    Hu, Yahui
    Liu, Zewei
    Li, Jinyang
    Li, Liujun
    PLANT PHENOMICS, 2024, 6
  • [19] Tunnel crack segmentation based on lightweight Transformer
    Kuang, Xianyan
    Xu, Yaoming
    Lei, Hui
    Cheng, Fujun
    Huan, Xianglan
    Journal of Railway Science and Engineering, 2024, 21 (08): 3421-3433
  • [20] Light4Mars: A lightweight transformer model for semantic segmentation on unstructured environment like Mars
    Xiong, Yonggang
    Xiao, Xueming
    Yao, Meibao
    Cui, Hutao
    Fu, Yuegang
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2024, 214: 167-178