Semantic segmentation using cross-stage feature reweighting and efficient self-attention

Cited by: 0
Authors
Ma, Yingdong [1 ]
Lan, Xiaobin [1 ]
Affiliations
[1] Inner Mongolia Univ, Coll Comp Sci, 235 West Daxue Rd, Hohhot, Peoples R China
Keywords
Semantic segmentation; Convolutional neural networks; Transformer; Feature fusion and reweighting; Network
DOI
10.1016/j.imavis.2024.104996
Chinese Library Classification
TP18 [Theory of artificial intelligence]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recently, vision transformers (ViTs) have demonstrated strong performance in various computer vision tasks. The success of ViTs can be attributed to their ability to capture long-range dependencies. However, transformer-based approaches often yield segmentation maps with incomplete object structures because of restricted cross-stage information propagation and a lack of low-level details. To address these problems, we introduce a CNN-transformer semantic segmentation architecture that adopts a CNN backbone for multi-level feature extraction and a transformer encoder focused on global perception learning. Transformer embeddings of all stages are integrated to compute feature weights for dynamic cross-stage feature reweighting. As a result, high-level semantic context and low-level spatial details can be embedded into each stage to preserve multi-level information. An efficient attention-based feature fusion mechanism is developed to combine the reweighted transformer embeddings with CNN features and generate segmentation maps with more complete object structures. Unlike regular self-attention, which has quadratic computational complexity, our efficient self-attention method achieves similar performance with linear complexity. Experimental results on the ADE20K and Cityscapes datasets show that the proposed segmentation approach outperforms most state-of-the-art networks.
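The abstract does not give implementation details, but its two core ideas (linear-complexity self-attention and cross-stage reweighting of multi-stage transformer embeddings) can be illustrated with a minimal PyTorch sketch. The module names (LinearSelfAttention, CrossStageReweight), tensor shapes, and the specific normalizations below are illustrative assumptions, not the authors' formulation; the attention follows the widely used "efficient attention" trick of normalizing queries over channels and keys over tokens so that K^T V is computed first, reducing the cost from O(N^2 d) to O(N d^2).

import torch
import torch.nn as nn


class LinearSelfAttention(nn.Module):
    # Generic linear-complexity self-attention (illustrative; the paper's exact
    # formulation is not specified in the abstract).
    def __init__(self, dim, heads=8):
        super().__init__()
        assert dim % heads == 0
        self.heads, self.head_dim = heads, dim // heads
        self.to_qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        # x: (batch, tokens, dim), e.g. a flattened H*W feature map.
        b, n, _ = x.shape
        q, k, v = (t.view(b, n, self.heads, self.head_dim).transpose(1, 2)
                   for t in self.to_qkv(x).chunk(3, dim=-1))
        q = q.softmax(dim=-1)                # normalize queries over channels
        k = k.softmax(dim=-2)                # normalize keys over tokens
        context = k.transpose(-2, -1) @ v    # (b, heads, head_dim, head_dim)
        out = q @ context                    # cost grows linearly with n
        return self.proj(out.transpose(1, 2).reshape(b, n, -1))


class CrossStageReweight(nn.Module):
    # Hypothetical sketch of cross-stage feature reweighting: pool the transformer
    # embeddings of every stage and predict per-channel gates for each stage.
    def __init__(self, stage_dims):
        super().__init__()
        total = sum(stage_dims)
        self.gates = nn.ModuleList(
            [nn.Sequential(nn.Linear(total, d), nn.Sigmoid()) for d in stage_dims])

    def forward(self, stage_feats):
        # stage_feats: list of (batch, tokens_i, dim_i) embeddings, one per stage.
        pooled = torch.cat([f.mean(dim=1) for f in stage_feats], dim=-1)
        return [f * g(pooled).unsqueeze(1) for f, g in zip(stage_feats, self.gates)]

As a usage example under the same assumptions, four stages of illustrative widths [64, 128, 320, 512] could be gated with CrossStageReweight([64, 128, 320, 512]) before an attention-based fusion step built from LinearSelfAttention; the stage widths and the gating mechanism are placeholders, not values taken from the paper.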
Pages: 11
Related papers (50 in total)
  • [1] Cross-stage feature fusion and efficient self-attention for salient object detection. Xia, Xiaofeng; Ma, Yingdong. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 104.
  • [2] Self-attention feature fusion network for semantic segmentation. Zhou, Zhen; Zhou, Yan; Wang, Dongli; Mu, Jinzhen; Zhou, Haibin. NEUROCOMPUTING, 2021, 453: 50-59.
  • [3] Efficient Semantic Segmentation via Self-Attention and Self-Distillation. An, Shumin; Liao, Qingmin; Lu, Zongqing; Xue, Jing-Hao. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (09): 15256-15266.
  • [4] Pyramid Self-attention for Semantic Segmentation. Qi, Jiyang; Wang, Xinggang; Hu, Yao; Tang, Xu; Liu, Wenyu. PATTERN RECOGNITION AND COMPUTER VISION, PT I, 2021, 13019: 480-492.
  • [5] Lightweight Self-Attention Network for Semantic Segmentation. Zhou, Yan; Zhou, Haibin; Li, Nanjun; Li, Jianxun; Wang, Dongli. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022.
  • [6] FsaNet: Frequency Self-Attention for Semantic Segmentation. Zhang, Fengyu; Panahi, Ashkan; Gao, Guangjun. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32: 4757-4772.
  • [7] Lunet: an enhanced upsampling fusion network with efficient self-attention for semantic segmentation. Zhou, Yan; Zhou, Haibin; Yang, Yin; Li, Jianxun; Irampaye, Richard; Wang, Dongli; Zhang, Zhengpeng. VISUAL COMPUTER, 2024: 3109-3128.
  • [8] SATS: Self-attention transfer for continual semantic segmentation. Qiu, Yiqiao; Shen, Yixing; Sun, Zhuohao; Zheng, Yanchong; Chang, Xiaobin; Zheng, Weishi; Wang, Ruixuan. PATTERN RECOGNITION, 2023, 138.
  • [9] A sketch semantic segmentation method using novel local feature aggregation and segment-level self-attention. Wang, Lei; Zhang, Shihui; Wang, Wei; Zhao, Weibo. NEURAL COMPUTING & APPLICATIONS, 2023, 35 (21): 15295-15313.