ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation

Cited by: 613
Authors
Mehta, Sachin [1]
Rastegari, Mohammad [2, 3]
Caspi, Anat [1]
Shapiro, Linda [1]
Hajishirzi, Hannaneh [1]
Affiliations
[1] Univ Washington, Seattle, WA 98195 USA
[2] Allen Inst AI, Seattle, WA USA
[3] XNOR AI, Seattle, WA USA
DOI
10.1007/978-3-030-01249-6_34
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce a fast and efficient convolutional neural network, ESPNet, for semantic segmentation of high resolution images under resource constraints. ESPNet is based on a new convolutional module, efficient spatial pyramid (ESP), which is efficient in terms of computation, memory, and power. ESPNet is 22 times faster (on a standard GPU) and 180 times smaller than the state-of-the-art semantic segmentation network PSPNet, while its category-wise accuracy is only 8% less. We evaluated ESPNet on a variety of semantic segmentation datasets including Cityscapes, PASCAL VOC, and a breast biopsy whole slide image dataset. Under the same constraints on memory and computation, ESPNet outperforms all the current efficient CNN networks such as MobileNet, ShuffleNet, and ENet on both standard metrics and our newly introduced performance metrics that measure efficiency on edge devices. Our network can process high resolution images at a rate of 112 and 9 frames per second on a standard GPU and edge device, respectively. Our code is open-source and available at https://sacmehta.github.io/ESPNet/.
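The title describes the ESP module as a spatial pyramid of dilated convolutions. The sketch below is only an illustration of why such a layout can be cheap; it is not taken from the abstract. It assumes a pointwise (1x1) reduction followed by parallel dilated branches with dilation rates 1, 2, 4, ..., and all function names are hypothetical. The effective kernel size formula (k - 1) * d + 1 is the standard one for dilated convolutions.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Side length covered by a kernel_size x kernel_size kernel with the given dilation."""
    return (kernel_size - 1) * dilation + 1


def pyramid_effective_kernels(kernel_size: int = 3, branches: int = 4) -> list:
    """Effective kernel sizes of the parallel branches.

    Assumption: branch k uses dilation rate 2**k (1, 2, 4, ...).
    """
    return [effective_kernel_size(kernel_size, 2 ** k) for k in range(branches)]


def standard_conv_params(c_in: int, c_out: int, kernel_size: int = 3) -> int:
    """Weight count of a single standard convolution (bias ignored)."""
    return kernel_size * kernel_size * c_in * c_out


def pyramid_params(c_in: int, c_out: int, kernel_size: int = 3, branches: int = 4) -> int:
    """Weight count of the assumed reduce-then-branch layout (bias ignored).

    Assumption: a 1x1 convolution reduces c_in to d = c_out // branches channels,
    then each branch applies a dilated kernel_size x kernel_size convolution d -> d,
    and the branch outputs are concatenated to give c_out channels.
    """
    d = c_out // branches
    reduce_weights = c_in * d
    branch_weights = branches * kernel_size * kernel_size * d * d
    return reduce_weights + branch_weights


if __name__ == "__main__":
    print(pyramid_effective_kernels(3, 4))
    print(standard_conv_params(128, 128, 3))
    print(pyramid_params(128, 128, 3, 4))
```

Under these assumptions, with 128 input and output channels, a 3x3 kernel, and 4 branches, the standard convolution has 147456 weights while the pyramid layout has 40960, and the branches cover effective kernel sizes of 3, 5, 9, and 17. Dilation enlarges the covered area without adding weights, which is one plausible reading of how the module stays small.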
Pages: 561-580
Page count: 20
Related papers
50 in total
  • [1] Efficient Fast Semantic Segmentation Using Continuous Shuffle Dilated Convolutions
    Hu, Xuegang
    Wang, Haibo
    IEEE ACCESS, 2020, 8 : 70913 - 70924
  • [2] Efficient Semantic Segmentation Using Spatio-Channel Dilated Convolutions
    Kim, Jaeseon
    Heo, Yong Seok
    IEEE ACCESS, 2019, 7 : 154239 - 154252
  • [3] Mixed spatial pyramid pooling for semantic segmentation
    Xia, Zhengyu
    Kim, Joohee
    APPLIED SOFT COMPUTING, 2020, 91
  • [4] A Unified Efficient Pyramid Transformer for Semantic Segmentation
    Zhu, Fangrui
    Zhu, Yi
    Zhang, Li
    Wu, Chongruo
    Fu, Yanwei
    Li, Mu
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 2667 - 2677
  • [5] Efficient Attention Pyramid Network for Semantic Segmentation
    Yang, Qirui
    Ku, Tao
    Hu, Kunyuan
    IEEE ACCESS, 2021, 9 : 18867 - 18875
  • [6] Fast and Accurate Real-Time Semantic Segmentation with Dilated Asymmetric Convolutions
    Rosas-Arias, Leonel
    Benitez-Garcia, Gibran
    Portillo-Portillo, Jose
    Sanchez-Perez, Gabriel
    Yanai, Keiji
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 2264 - 2271
  • [7] Large Kernel Spatial Pyramid Pooling for Semantic Segmentation
    Yang, Jiayi
    Hu, Tianshi
    Yang, Junli
    Zhang, Zhaoxing
    Pan, Yue
    IMAGE AND GRAPHICS, ICIG 2019, PT I, 2019, 11901 : 595 - 605
  • [8] Spatial Pyramid Based Graph Reasoning for Semantic Segmentation
    Li, Xia
    Yang, Yibo
    Zhao, Qijie
    Shen, Tiancheng
    Lin, Zhouchen
    Liu, Hong
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, : 8947 - 8956
  • [9] Parallel Dense Merging Network with Dilated Convolutions for Semantic Segmentation of Sports Movement Scene
    Huang, Dongya
    Zhang, Li
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2022, 16 (11): : 3493 - 3506
  • [10] EADNET: EFFICIENT ASYMMETRIC DILATED NETWORK FOR SEMANTIC SEGMENTATION
    Yang, Qihang
    Chen, Tao
    Fang, Jiayuan
    Lu, Ye
    Zuo, Chongyan
    Chi, Qinghua
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 2315 - 2319