ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation

Cited by: 613
Authors
Mehta, Sachin [1]
Rastegari, Mohammad [2, 3]
Caspi, Anat [1]
Shapiro, Linda [1]
Hajishirzi, Hannaneh [1]
Affiliations
[1] Univ Washington, Seattle, WA 98195 USA
[2] Allen Inst AI, Seattle, WA USA
[3] XNOR AI, Seattle, WA USA
Source
Keywords
DOI
10.1007/978-3-030-01249-6_34
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce a fast and efficient convolutional neural network, ESPNet, for semantic segmentation of high resolution images under resource constraints. ESPNet is based on a new convolutional module, efficient spatial pyramid (ESP), which is efficient in terms of computation, memory, and power. ESPNet is 22 times faster (on a standard GPU) and 180 times smaller than the state-of-the-art semantic segmentation network PSPNet, while its category-wise accuracy is only 8% less. We evaluated ESPNet on a variety of semantic segmentation datasets including Cityscapes, PASCAL VOC, and a breast biopsy whole slide image dataset. Under the same constraints on memory and computation, ESPNet outperforms all the current efficient CNN networks such as MobileNet, ShuffleNet, and ENet on both standard metrics and our newly introduced performance metrics that measure efficiency on edge devices. Our network can process high resolution images at a rate of 112 and 9 frames per second on a standard GPU and edge device, respectively. Our code is open-source and available at https://sacmehta.github.io/ESPNet/.
Pages: 561-580
Page count: 20
Related papers
50 in total
  • [21] Retinal-Layer Segmentation Using Dilated Convolutions
    Reddy, T. Guru Pradeep
    Ashritha, Kandiraju Sai
    Prajwala, T. M.
    Girish, G. N.
    Kothari, Abhishek R.
    Koolagudi, Shashidhar G.
    Rajan, Jeny
    PROCEEDINGS OF 3RD INTERNATIONAL CONFERENCE ON COMPUTER VISION AND IMAGE PROCESSING, CVIP 2018, VOL 1, 2020, 1022 : 279 - 292
  • [22] AGGREGATED DILATED CONVOLUTIONS FOR EFFICIENT MOTION DEBLURRING
    Miao, Hong
    Zhang, Wenqiang
    Bai, Jiansong
    2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2018,
  • [23] Analysis of Spatial Pyramid Pooling Variations in Semantic Segmentation for Satellite Image Applications
    Abdani, Siti Raihanah
    Zulkifley, Mohd Asyraf
    Zulkifley, Nuraisyah Hani
    2021 INTERNATIONAL CONFERENCE ON DECISION AID SCIENCES AND APPLICATION (DASA), 2021,
  • [24] AtICNet: semantic segmentation with atrous spatial pyramid pooling in image cascade network
    Chen, Jin
    Wang, Chuanya
    Tong, Ying
    EURASIP JOURNAL ON WIRELESS COMMUNICATIONS AND NETWORKING, 2019, 2019 (1)
  • [25] Semantic segmentation using stride spatial pyramid pooling and dual attention decoder
    Peng, Chengli
    Ma, Jiayi
    PATTERN RECOGNITION, 2020, 107 (107)
  • [26] AtICNet: semantic segmentation with atrous spatial pyramid pooling in image cascade network
    Jin Chen
    Chuanya Wang
    Ying Tong
    EURASIP Journal on Wireless Communications and Networking, 2019
  • [27] An Efficient Solution for Semantic Segmentation: ShuffleNet V2 with Atrous Separable Convolutions
    Turkmen, Sercan
    Heikkila, Janne
    IMAGE ANALYSIS, 2019, 11482 : 41 - 53
  • [28] Pyramid Context Contrast for Semantic Segmentation
    Chen, Yuzhong
    Lin, Yangyang
    Niu, Yuzhen
    Ke, Xiao
    Huang, Tengda
    IEEE ACCESS, 2019, 7 : 173679 - 173693
  • [29] Pyramid Fusion Transformer for Semantic Segmentation
    Qin, Zipeng
    Liu, Jianbo
    Zhang, Xiaolin
    Tian, Maoqing
    Zhou, Aojun
    Yi, Shuai
    Li, Hongsheng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 9630 - 9643
  • [30] Semantic segmentation with hybrid pyramid pooling and stacked pyramid structure
    Lian, Xuhang
    Pang, Yanwei
    Han, Jungong
    Pan, Jing
    NEUROCOMPUTING, 2020, 410 : 454 - 467