Efficient Attention Pyramid Network for Semantic Segmentation

Cited by: 8
Authors
Yang, Qirui [1 ,2 ,3 ]
Ku, Tao [1 ,2 ]
Hu, Kunyuan [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Shenyang Inst Automat, Shenyang 110016, Peoples R China
[2] Chinese Acad Sci, Inst Robot & Intelligent Mfg, Shenyang 110169, Peoples R China
[3] Univ Chinese Acad Sci, Sch Comp & Control, Beijing 100049, Peoples R China
Keywords
Semantics; Convolution; Feature extraction; Task analysis; Image segmentation; Decoding; Computer vision; Semantic segmentation; attention mechanism; spatial pyramid; PASCAL VOC 2012; Cityscapes;
DOI
10.1109/ACCESS.2021.3053316
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Semantic segmentation is a task that covers most of the perception needs of intelligent vehicles in a unified way. Recent studies have shown that attention mechanisms achieve impressive performance in computer vision tasks. Current attention-based segmentation methods differ from one another in the position and form of the attention mechanism, and they perform differently in practice. This paper first examines the effectiveness of multi-scale context features and attention mechanisms in segmentation tasks. We find that multi-scale features and channel attention can play a vital role in constructing effective context features. Based on this analysis, this paper proposes an efficient attention pyramid network (EAPNet) for semantic segmentation. Specifically, to efficiently handle the problem of segmenting objects at multiple scales, we design an efficient channel attention pyramid (ECAP), which employs atrous convolution with channel attention, in cascade or in parallel, to capture multi-scale context using multiple atrous rates. Furthermore, we propose a residual attention fusion block (RAFB), whose purpose is to focus simultaneously on meaningful low-level feature maps and spatial location information. We also explore different channel attention modules and spatial attention modules and describe their impact on network performance. We empirically evaluate EAPNet on two semantic segmentation datasets, PASCAL VOC 2012 and Cityscapes. Experimental results show that, without MS COCO pre-training or any post-processing, EAPNet achieves 81.7% mIoU on the PASCAL VOC 2012 validation set. With DeepLabv3+ as the baseline, EAPNet improves performance by more than 1.50% mIoU.
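The abstract reports results in mIoU (mean intersection over union). As a rough illustration of how that metric is commonly computed, the sketch below averages per-class IoU over flat lists of predicted and ground-truth labels. The function name `mean_iou`, its parameters, and the toy labels are assumptions made for this example; they are not taken from the paper, and benchmark toolkits typically accumulate counts over an entire dataset rather than a single sample.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Average per-class IoU over classes that occur in pred or target.

    pred and target are equal-length sequences of integer class labels
    (for example, flattened segmentation maps). Hypothetical helper.
    """
    inter = [0] * num_classes  # pixels where prediction and label agree
    union = [0] * num_classes  # pixels predicted as or labelled as the class
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip pixels marked as "void" in the ground truth
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            union[p] += 1
            union[t] += 1
    # Classes that never occur are left out of the average.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.3f}")
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5.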
Pages: 18867-18875
Number of pages: 9