Semantic segmentation using stride spatial pyramid pooling and dual attention decoder

Cited by: 62
Authors
Peng, Chengli [1]
Ma, Jiayi [1]
Affiliations
[1] Wuhan Univ, Elect Informat Sch, Wuhan 430072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantic segmentation; Convolutional neural networks; Pyramid pooling; Attention mechanism; NETWORKS; FORCE;
DOI
10.1016/j.patcog.2020.107498
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Semantic segmentation is an end-to-end task that requires both semantic and spatial accuracy. Deep learning-based segmentation methods therefore need to make effective use of both the high-level feature map, which is rich in semantic information, and the low-level feature map, which carries accurate spatial information. However, existing segmentation networks typically cannot take full advantage of these two kinds of feature maps, leading to inferior performance. This paper attempts to overcome this challenge by introducing two novel structures. First, we propose a structure called stride spatial pyramid pooling (SSPP) to capture multiscale semantic information from the high-level feature map. Compared with existing pyramid pooling methods based on atrous convolution, the SSPP structure gathers more information from the high-level feature map at a faster inference speed, which significantly improves how efficiently the high-level feature map is used. Second, we propose a dual attention decoder consisting of a channel attention branch and a spatial attention branch to make full use of the high- and low-level feature maps simultaneously. The dual attention decoder yields a more "semantic" low-level feature map and a high-level feature map with more accurate spatial information, which bridges the gap between these two kinds of feature maps and benefits their fusion. We evaluate the proposed model on several publicly available semantic image segmentation benchmarks, including PASCAL VOC 2012, Cityscapes and COCO-Stuff. The qualitative and quantitative results demonstrate that our method achieves state-of-the-art performance. © 2020 Elsevier Ltd. All rights reserved.
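The abstract names the two structures but does not give their layer-level design, so the following is only a rough NumPy sketch of the general ideas, not the authors' implementation. It assumes that multiscale context can be approximated by average pooling at several strides, and that the two attention branches can be approximated by parameter-free sigmoid gates. The function names, the stride set (1, 2, 4) and the tensor shapes are all hypothetical; a real network would use learned convolutions in each branch.

```python
import numpy as np


def strided_avg_pool(x, stride):
    """Average-pool a (C, H, W) map with a square window equal to the stride."""
    c, h, w = x.shape
    h2, w2 = h // stride, w // stride
    x = x[:, :h2 * stride, :w2 * stride]  # crop so the window tiles evenly
    return x.reshape(c, h2, stride, w2, stride).mean(axis=(2, 4))


def upsample_nearest(x, size):
    """Nearest-neighbour resize of a (C, h, w) map to (C, H, W)."""
    _, h, w = x.shape
    out_h, out_w = size
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return x[:, rows[:, None], cols[None, :]]


def multiscale_pool(high, strides=(1, 2, 4)):
    """Assumed stand-in for pyramid pooling: pool at several strides,
    resize each branch back to the input size, concatenate on channels."""
    _, h, w = high.shape
    branches = [upsample_nearest(strided_avg_pool(high, s), (h, w))
                for s in strides]
    return np.concatenate(branches, axis=0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def dual_attention_fuse(low, high):
    """Assumed stand-in for a dual attention decoder.

    Channel gate: a per-channel summary of the high-level map reweights
    the low-level channels (pushing semantics into the low-level map).
    Spatial gate: a per-pixel summary of the low-level map reweights the
    upsampled high-level map (pushing spatial detail into it).
    Both inputs are assumed to have the same number of channels.
    """
    high_up = upsample_nearest(high, low.shape[1:])
    channel_gate = sigmoid(high.mean(axis=(1, 2)))[:, None, None]  # (C, 1, 1)
    spatial_gate = sigmoid(low.mean(axis=0))[None, :, :]           # (1, H, W)
    return np.concatenate([low * channel_gate, high_up * spatial_gate], axis=0)


rng = np.random.default_rng(0)
high = rng.standard_normal((8, 16, 16))   # coarse, semantic features
low = rng.standard_normal((8, 64, 64))    # fine, spatial features

context = multiscale_pool(high)           # shape (24, 16, 16)
fused = dual_attention_fuse(low, high)    # shape (16, 64, 64)
```

The gates here only illustrate the direction of information flow between the two feature maps; they carry no learned parameters and should not be read as reproducing the reported results.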
Pages: 15
Related papers
50 in total
  • [41] Lidar Point Semantic Segmentation Using Dual Attention Mechanism
    Haosen Wang
    Yuan Zhou
    Tiankai Chen
    Feng Qian
    Yue Ma
    Shifeng Wang
    Bo Lu
    Journal of Russian Laser Research, 2023, 44 : 224 - 234
  • [42] Multi-scale retinal vessel segmentation using encoder-decoder network with squeeze-and-excitation connection and atrous spatial pyramid pooling
    Xie, Huiying
    Tang, Chen
    Zhang, Wei
    Shen, Yuxin
    Lei, Zhengkun
    APPLIED OPTICS, 2021, 60 (02) : 239 - 249
  • [43] Retinal Vessel Segmentation Using Multi-Directional Stripe Convolution and Pyramid Dual Pooling
    Kong, Linfeng
    Wu, Yun
    LASER & OPTOELECTRONICS PROGRESS, 2025, 62 (02)
  • [44] Pyramid Attention Aggregation Network for Semantic Segmentation of Surgical Instruments
    Ni, Zhen-Liang
    Bian, Gui-Bin
    Wang, Guan-An
    Zhou, Xiao-Hu
    Hou, Zeng-Guang
    Chen, Hua-Bin
    Xie, Xiao-Liang
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11782 - 11790
  • [45] Triple fusion and feature pyramid decoder for RGB-D semantic segmentation
    Ge, Bin
    Zhu, Xu
    Tang, Zihan
    Xia, Chenxing
    Lu, Yiming
    Chen, Zhuang
    MULTIMEDIA SYSTEMS, 2024, 30 (05)
  • [46] Pyramid Pooling Channel Attention Network for esophageal tissue segmentation on OCT images
    Li, Deyin
    Zhang, Miao
    Shi, Wei
    Zhang, Huimin
    Wang, Duoduo
    Wang, Lirong
    2020 IEEE 19TH INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (TRUSTCOM 2020), 2020 : 1476 - 1480
  • [47] Semantic Segmentation for High Spatial Resolution Remote Sensing Images Based on Convolution Neural Network and Pyramid Pooling Module
    Yu, Bo
    Yang, Lu
    Chen, Fang
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2018, 11 (09) : 3252 - 3261
  • [48] Dual-Branch Attention Network and Atrous Spatial Pyramid Pooling for Diabetic Retinopathy Classification Using Ultra-Widefield Images
    Tian, Zhihui
    Lei, Haijun
    Xie, Hai
    Zeng, Xianlu
    Zhao, Xinyu
    Chen, Miaohong
    Zhang, Guoming
    Lei, Baiying
    OPHTHALMIC MEDICAL IMAGE ANALYSIS, OMIA 2021, 2021, 12970 : 119 - 128
  • [49] Segmentation of ground glass pulmonary nodules using full convolution residual network based on atrous spatial pyramid pooling structure and attention mechanism
    Dong, Ting
    Wei, Long
    Ye, Xiaodan
    Chen, Yang
    Hou, Xuewen
    Nie, Shengdong
    Shengwu Yixue Gongchengxue Zazhi/Journal of Biomedical Engineering, 2022, 39 (03) : 441 - 451
  • [50] Waterfall Atrous Spatial Pooling Architecture for Efficient Semantic Segmentation
    Artacho, Bruno
    Savakis, Andreas
    SENSORS, 2019, 19 (24)