ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation

Cited by: 613
Authors
Mehta, Sachin [1]
Rastegari, Mohammad [2, 3]
Caspi, Anat [1]
Shapiro, Linda [1]
Hajishirzi, Hannaneh [1]
Affiliations
[1] Univ Washington, Seattle, WA 98195 USA
[2] Allen Inst AI, Seattle, WA USA
[3] XNOR AI, Seattle, WA USA
DOI
10.1007/978-3-030-01249-6_34
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We introduce a fast and efficient convolutional neural network, ESPNet, for semantic segmentation of high resolution images under resource constraints. ESPNet is based on a new convolutional module, efficient spatial pyramid (ESP), which is efficient in terms of computation, memory, and power. ESPNet is 22 times faster (on a standard GPU) and 180 times smaller than the state-of-the-art semantic segmentation network PSPNet, while its category-wise accuracy is only 8% less. We evaluated ESPNet on a variety of semantic segmentation datasets including Cityscapes, PASCAL VOC, and a breast biopsy whole slide image dataset. Under the same constraints on memory and computation, ESPNet outperforms all the current efficient CNN networks such as MobileNet, ShuffleNet, and ENet on both standard metrics and our newly introduced performance metrics that measure efficiency on edge devices. Our network can process high resolution images at a rate of 112 and 9 frames per second on a standard GPU and edge device, respectively. Our code is open-source and available at https://sacmehta.github.io/ESPNet/.
Pages: 561-580
Number of pages: 20