Semantic segmentation using stride spatial pyramid pooling and dual attention decoder

Cited by: 62
Authors
Peng, Chengli [1]
Ma, Jiayi [1]
Affiliations
[1] Wuhan University, Electronic Information School, Wuhan 430072, People's Republic of China
Funding
National Natural Science Foundation of China;
Keywords
Semantic segmentation; Convolutional neural networks; Pyramid pooling; Attention mechanism; NETWORKS; FORCE;
DOI
10.1016/j.patcog.2020.107498
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Semantic segmentation is an end-to-end task that requires both semantic and spatial accuracy. It is important for deep learning-based segmentation methods to effectively utilize the high-level feature map, whose semantic information is abundant, and the low-level feature map, whose spatial information is accurate. However, existing segmentation networks typically cannot take full advantage of these two kinds of feature maps, leading to inferior performance. This paper attempts to overcome this challenge by introducing two novel structures. On the one hand, we propose a structure called stride spatial pyramid pooling (SSPP) to capture multiscale semantic information from the high-level feature map. Compared with existing pyramid pooling methods based on atrous convolution, the SSPP structure is able to gather more information from the high-level feature map with faster inference speed, which significantly improves the utilization efficiency of the high-level feature map. On the other hand, we propose a dual attention decoder consisting of a channel attention branch and a spatial attention branch to make full use of the high- and low-level feature maps simultaneously. The dual attention decoder yields a more "semantic" low-level feature map and a high-level feature map with more accurate spatial information, which bridges the gap between these two kinds of feature maps and benefits their fusion. We evaluate the proposed model on several publicly available semantic image segmentation benchmarks, including PASCAL VOC 2012, Cityscapes and COCO-Stuff. The qualitative and quantitative results demonstrate that our method achieves state-of-the-art performance. © 2020 Elsevier Ltd. All rights reserved.
Pages: 15
Related papers
50 results in total
  • [21] Pyramid Self-attention for Semantic Segmentation
    Qi, Jiyang
    Wang, Xinggang
    Hu, Yao
    Tang, Xu
    Liu, Wenyu
    PATTERN RECOGNITION AND COMPUTER VISION, PT I, 2021, 13019 : 480 - 492
  • [22] Dense Semantic Labeling with Atrous Spatial Pyramid Pooling and Decoder for High-Resolution Remote Sensing Imagery
    Wang, Yuhao
    Liang, Binxiu
    Ding, Meng
    Li, Jiangyun
    REMOTE SENSING, 2019, 11 (01)
  • [23] Efficient Attention Pyramid Network for Semantic Segmentation
    Yang, Qirui
    Ku, Tao
    Hu, Kunyuan
    IEEE ACCESS, 2021, 9 : 18867 - 18875
  • [24] Global Attention Pyramid Network for Semantic Segmentation
    Zhang, Na
    Li, Jun
    Li, Yongrui
    Du, Yang
    PROCEEDINGS OF THE 38TH CHINESE CONTROL CONFERENCE (CCC), 2019 : 8728 - 8732
  • [25] Image segmentation of skin lesions based on dense atrous spatial pyramid pooling and attention mechanism
    Yin W.
    Zhou D.
    Fan T.
    Yu Z.
    Li Z.
    Shengwu Yixue Gongchengxue Zazhi/Journal of Biomedical Engineering, 2022, 39 (06) : 1108 - 1116
  • [26] DUAL-BRANCH ATTENTION NETWORK AND SWIN SPATIAL PYRAMID POOLING FOR RETINOPATHY OF PREMATURITY CLASSIFICATION
    Zhao, Jia
    Lei, Haijun
    Xie, Hai
    Li, Pingkang
    Liu, Yaling
    Zhang, Guoming
    Lei, Baiying
    2023 IEEE 20TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING, ISBI, 2023
  • [27] PPNet: pooling position attention network for semantic segmentation
    Xu, Haixia
    Wang, Wei
    Wang, Shuailong
    Zhou, Wei
    Chen, Qi
    Peng, Wei
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (12) : 37007 - 37023
  • [28] PPNet: pooling position attention network for semantic segmentation
    Haixia Xu
    Wei Wang
    Shuailong Wang
    Wei Zhou
    Qi Chen
    Wei Peng
    Multimedia Tools and Applications, 2024, 83 : 37007 - 37023
  • [29] Convolution-deconvolution architecture with the pyramid pooling module for semantic segmentation
    Amirhossein Malekijoo
    Mohammad Javad Fadaeieslam
    Multimedia Tools and Applications, 2019, 78 : 32379 - 32392
  • [30] Convolution-deconvolution architecture with the pyramid pooling module for semantic segmentation
    Malekijoo, Amirhossein
    Fadaeieslam, Mohammad Javad
    MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (22) : 32379 - 32392