Semantic segmentation using cross-stage feature reweighting and efficient self-attention

Times Cited: 0
Authors
Ma, Yingdong [1]
Lan, Xiaobin [1]
Affiliations
[1] Inner Mongolia Univ, Coll Comp Sci, 235 West Daxue Rd, Hohhot, Peoples R China
Keywords
Semantic segmentation; Convolutional neural networks; Transformer; Feature fusion and reweighting; Network
DOI
10.1016/j.imavis.2024.104996
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Recently, vision transformers (ViTs) have demonstrated strong performance in various computer vision tasks. The success of ViTs can be attributed to their ability to capture long-range dependencies. However, transformer-based approaches often yield segmentation maps with incomplete object structures because of restricted cross-stage information propagation and a lack of low-level details. To address these problems, we introduce a CNN-transformer semantic segmentation architecture that adopts a CNN backbone for multi-level feature extraction and a transformer encoder that focuses on global perception learning. Transformer embeddings from all stages are integrated to compute feature weights for dynamic cross-stage feature reweighting. As a result, high-level semantic context and low-level spatial details can be embedded into each stage to preserve multi-level information. An efficient attention-based feature fusion mechanism is developed to combine the reweighted transformer embeddings with CNN features and generate segmentation maps with more complete object structures. Unlike regular self-attention, which has quadratic computational complexity, our efficient self-attention method achieves similar performance with linear complexity. Experimental results on the ADE20K and Cityscapes datasets show that the proposed segmentation approach outperforms most state-of-the-art networks.
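The abstract does not spell out the efficient self-attention design, so the PyTorch sketch below only illustrates the general idea of linear-complexity attention: softmax is applied separately to queries (over channels) and keys (over tokens), and a small key-value summary is formed before multiplying by the queries, so the N x N token-token matrix is never materialised. The class name LinearSelfAttention, the head count, and all layer choices are assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn

class LinearSelfAttention(nn.Module):
    # Illustrative linear-complexity self-attention. Assumption: the paper's
    # exact module is not described in the abstract; this follows a common
    # "efficient attention" factorisation rather than the authors' design.
    def __init__(self, dim, heads=4):
        super().__init__()
        assert dim % heads == 0
        self.heads = heads
        self.to_qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                             # x: (B, N, dim) tokens
        B, N, C = x.shape
        qkv = self.to_qkv(x).reshape(B, N, 3, self.heads, C // self.heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)          # each: (B, heads, N, d)
        q = q.softmax(dim=-1)                         # normalise queries over channels
        k = k.softmax(dim=-2)                         # normalise keys over tokens
        context = k.transpose(-2, -1) @ v             # (B, heads, d, d), costs O(N * d^2)
        out = q @ context                             # (B, heads, N, d), no N x N matrix
        out = out.transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

# Example: 4096 tokens (a 64 x 64 feature map) with 256 channels.
x = torch.randn(2, 4096, 256)
print(LinearSelfAttention(dim=256)(x).shape)          # torch.Size([2, 4096, 256])

Because the key-value summary is a d x d matrix per head, the cost grows linearly with the number of tokens N, which is consistent with the linear-complexity claim in the abstract.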
Pages: 11