Semantic segmentation using stride spatial pyramid pooling and dual attention decoder

Cited by: 62
Authors
Peng, Chengli [1]
Ma, Jiayi [1]
Affiliations
[1] Wuhan Univ, Elect Informat Sch, Wuhan 430072, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Semantic segmentation; Convolutional neural networks; Pyramid pooling; Attention mechanism; NETWORKS; FORCE;
DOI
10.1016/j.patcog.2020.107498
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Semantic segmentation is an end-to-end task that requires both semantic and spatial accuracy. It is important for deep learning-based segmentation methods to effectively utilize the high-level feature map, whose semantic information is abundant, and the low-level feature map, whose spatial information is accurate. However, existing segmentation networks typically cannot take full advantage of these two kinds of feature maps, leading to inferior performance. This paper attempts to overcome this challenge by introducing two novel structures. On the one hand, we propose a structure called stride spatial pyramid pooling (SSPP) to capture multiscale semantic information from the high-level feature map. Compared with existing pyramid pooling methods based on atrous convolution, the SSPP structure is able to gather more information from the high-level feature map with faster inference speed, which significantly improves the utilization efficiency of the high-level feature map. On the other hand, we propose a dual attention decoder consisting of a channel attention branch and a spatial attention branch to make full use of the high- and low-level feature maps simultaneously. The dual attention decoder yields a more "semantic" low-level feature map and a high-level feature map with more accurate spatial information, which bridges the gap between these two kinds of feature maps and benefits their fusion. We evaluate the proposed model on several publicly available semantic image segmentation benchmarks, including PASCAL VOC 2012, Cityscapes and COCO-Stuff. The qualitative and quantitative results demonstrate that our method achieves state-of-the-art performance. © 2020 Elsevier Ltd. All rights reserved.
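The abstract does not give the internals of the dual attention decoder, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that the channel attention weights are derived from the high-level map and applied to the low-level map, that the spatial attention weights are derived from the low-level map and applied to the high-level map, and that both maps have already been brought to the same shape. The function name `dual_attention_fuse`, the parameter-free sigmoid gating, and the additive fusion are all hypothetical choices made for this example.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def dual_attention_fuse(high, low):
    """Fuse two feature maps of shape (C, H, W) with two attention gates.

    Hypothetical sketch: a real decoder would use learned layers here.
    """
    if high.shape != low.shape:
        raise ValueError("this sketch assumes both maps share one shape")

    # Channel attention (assumed): one weight per channel, computed by
    # global average pooling over the high-level map.
    channel_w = sigmoid(high.mean(axis=(1, 2), keepdims=True))  # (C, 1, 1)

    # Spatial attention (assumed): one weight per pixel, computed by
    # averaging the low-level map over its channels.
    spatial_w = sigmoid(low.mean(axis=0, keepdims=True))  # (1, H, W)

    # Re-weight each map with the gate derived from the other one.
    low_semantic = low * channel_w    # low-level map, semantically gated
    high_spatial = high * spatial_w   # high-level map, spatially gated

    # Additive fusion is an assumption; concatenation is another option.
    return low_semantic + high_spatial


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    high = rng.normal(size=(4, 8, 8))
    low = rng.normal(size=(4, 8, 8))
    print(dual_attention_fuse(high, low).shape)
```

The cross-gating reflects the abstract's stated aim of making the low-level map more "semantic" and the high-level map more spatially accurate before fusion; the exact layers used in the paper may differ.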
Pages: 15