SSformer: A Lightweight Transformer for Semantic Segmentation

Cited: 24
Authors
Shi, Wentao [1]
Xu, Jing [1]
Gao, Pan [1]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing, Peoples R China
Keywords
Image Segmentation; Transformer; Multilayer perceptron; Lightweight model
DOI
10.1109/MMSP55362.2022.9949177
CLC Classification Number
TP31 [Computer Software]
Discipline Codes
081202; 0835
Abstract
It is widely believed that Transformers perform better than convolutional neural networks in semantic segmentation. Nevertheless, the original Vision Transformer [2] may lack the inductive biases of local neighborhoods and has high time complexity. Recently, Swin Transformer [3] set a new record in various vision tasks by using a hierarchical architecture and shifted windows while being more efficient. However, as Swin Transformer is specifically designed for image classification, it may achieve suboptimal performance on dense prediction-based segmentation tasks. Further, simply combining Swin Transformer with existing methods would inflate the size and parameter count of the final segmentation model. In this paper, we rethink the Swin Transformer for semantic segmentation and design a lightweight yet effective transformer model, called SSformer. In this model, considering the inherent hierarchical design of Swin Transformer, we propose a decoder that aggregates information from different layers, thus capturing both local and global attention. Experimental results show that the proposed SSformer yields mIoU performance comparable to state-of-the-art models while maintaining a smaller model size and lower computational cost. Source code and pretrained models are available at: https://github.com/shiwt03/SSformer
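The abstract's key architectural idea is a lightweight decoder that aggregates the feature maps emitted by the different stages of a hierarchical Swin backbone. As a rough, hypothetical sketch of how such a multi-level all-MLP decoder is commonly built (the channel widths, class count, and fusion scheme below are assumptions for illustration, not SSformer's actual design; see the linked repository for that), consider:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LightweightDecoder(nn.Module):
    # Hypothetical all-MLP decoder over hierarchical Swin-style features.
    def __init__(self, in_channels=(96, 192, 384, 768),
                 embed_dim=256, num_classes=150):
        super().__init__()
        # One linear projection per backbone stage, mapping each stage's
        # channel width to a shared embedding width.
        self.projections = nn.ModuleList(
            nn.Linear(c, embed_dim) for c in in_channels
        )
        # 1x1 convs to fuse the concatenated stage features and classify.
        self.fuse = nn.Conv2d(embed_dim * len(in_channels), embed_dim, 1)
        self.classifier = nn.Conv2d(embed_dim, num_classes, 1)

    def forward(self, features):
        # features: four maps of shape (B, C_i, H_i, W_i) at strides 4/8/16/32.
        target_size = features[0].shape[2:]  # finest (stride-4) resolution
        projected = []
        for feat, proj in zip(features, self.projections):
            b, _, h, w = feat.shape
            x = proj(feat.flatten(2).transpose(1, 2))   # (B, H*W, embed_dim)
            x = x.transpose(1, 2).reshape(b, -1, h, w)  # (B, embed_dim, H, W)
            projected.append(F.interpolate(
                x, size=target_size, mode="bilinear", align_corners=False))
        fused = self.fuse(torch.cat(projected, dim=1))
        return self.classifier(fused)  # (B, num_classes, H/4, W/4)


# Toy usage with Swin-Tiny-like feature shapes for a 224x224 input.
feats = [torch.randn(1, c, 56 // 2 ** i, 56 // 2 ** i)
         for i, c in enumerate((96, 192, 384, 768))]
print(LightweightDecoder()(feats).shape)  # torch.Size([1, 150, 56, 56])
```

In this sketch, each stage's features are linearly projected to a shared width, upsampled to the finest resolution, concatenated, and fused with 1x1 convolutions, which keeps the decoder's parameter count small while still mixing local (early-stage) and global (late-stage) information.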
Pages: 5
Related Papers
50 records in total
  • [11] Lightweight convolutional neural networks with context broadcast transformer for real-time semantic segmentation
    Hu, Kaidi
    Xie, Zongxia
    Hu, Qinghua
    IMAGE AND VISION COMPUTING, 2024, 146
  • [12] Semantic segmentation of terrace image regions based on lightweight CNN-Transformer hybrid networks
    Liu X.
    Yi S.
    Li L.
    Cheng X.
    Wang C.
    Nongye Gongcheng Xuebao/Transactions of the Chinese Society of Agricultural Engineering, 2023, 39 (13): 171-181
  • [13] A lightweight network for smoke semantic segmentation
    Yuan, Feiniu
    Li, Kang
    Wang, Chunmei
    Fang, Zhijun
    PATTERN RECOGNITION, 2023, 137
  • [14] Eye Semantic Segmentation with A Lightweight Model
    Huynh, Van Thong
    Kim, Soo-Hyung
    Lee, Guee-Sang
    Yang, Hyung-Jeong
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019: 3694-3697
  • [15] Transformer Scale Gate for Semantic Segmentation
    Shi, Hengcan
    Hayat, Munawar
    Cai, Jianfei
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023: 3051-3060
  • [16] TransRVNet: LiDAR Semantic Segmentation With Transformer
    Cheng, Hui-Xian
    Han, Xian-Feng
    Xiao, Guo-Qiang
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (06): 5895-5907
  • [17] Pyramid Fusion Transformer for Semantic Segmentation
    Qin, Zipeng
    Liu, Jianbo
    Zhang, Xiaolin
    Tian, Maoqing
    Zhou, Aojun
    Yi, Shuai
    Li, Hongsheng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26: 9630-9643
  • [18] AISOA-SSformer: An Effective Image Segmentation Method for Rice Leaf Disease Based on the Transformer Architecture
    Dai, Weisi
    Zhu, Wenke
    Zhou, Guoxiong
    Liu, Genhua
    Xu, Jiaxin
    Zhou, Hongliang
    Hu, Yahui
    Liu, Zewei
    Li, Jinyang
    Li, Liujun
    PLANT PHENOMICS, 2024, 6
  • [19] Tunnel crack segmentation based on lightweight Transformer
    Kuang, Xianyan
    Xu, Yaoming
    Lei, Hui
    Cheng, Fujun
    Huan, Xianglan
    Journal of Railway Science and Engineering, 2024, 21 (08): 3421-3433
  • [20] Light4Mars: A lightweight transformer model for semantic segmentation on unstructured environment like Mars
    Xiong, Yonggang
    Xiao, Xueming
    Yao, Meibao
    Cui, Hutao
    Fu, Yuegang
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2024, 214: 167-178