HAPI: Hardware-Aware Progressive Inference

Cited by: 23
Authors
Laskaridis, Stefanos [1]
Venieris, Stylianos I. [1]
Kim, Hyeji [1]
Lane, Nicholas D. [1, 2]
Affiliations
[1] Samsung AI Ctr, Cambridge, England
[2] Univ Cambridge, Cambridge, England
Keywords
Multiobjective optimization
DOI
10.1145/3400302.3415698
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Convolutional neural networks (CNNs) have recently become the state-of-the-art in a diversity of AI tasks. Despite their popularity, CNN inference still comes at a high computational cost. A growing body of work aims to alleviate this by exploiting the difference in the classification difficulty among samples and early-exiting at different stages of the network. Nevertheless, existing studies on early exiting have primarily focused on the training scheme, without considering the use-case requirements or the deployment platform. This work presents HAPI, a novel methodology for generating high-performance early-exit networks by co-optimising the placement of intermediate exits together with the early-exit strategy at inference time. Furthermore, we propose an efficient design space exploration algorithm which enables the faster traversal of a large number of alternative architectures and generates the highest-performing design, tailored to the use-case requirements and target hardware. Quantitative evaluation shows that our system consistently outperforms alternative search mechanisms and state-of-the-art early-exit schemes across various latency budgets. Moreover, it pushes further the performance of highly optimised hand-crafted early-exit CNNs, delivering up to 5.11x speedup over lightweight models on imposed latency-driven SLAs for embedded devices.
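The early-exit idea the abstract refers to can be illustrated with a minimal sketch. This is not HAPI's method: it shows only a generic confidence-threshold exit policy, in which inference stops at the first intermediate classifier whose top-1 softmax probability reaches a threshold. The names `early_exit_predict`, `stages`, `exits`, and `threshold` are hypothetical, and the toy stages stand in for real CNN blocks.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    x: Vector,
    stages: Sequence[Callable[[Vector], Vector]],
    exits: Sequence[Callable[[Vector], Vector]],
    threshold: float,
) -> Tuple[int, int]:
    """Run backbone stages in order and return (label, exit_index).

    Hypothetical policy: stop at the first exit whose top-1 softmax
    confidence is >= threshold; the last exit always returns.
    """
    h = x
    last = len(stages) - 1
    for i, (stage, head) in enumerate(zip(stages, exits)):
        h = stage(h)              # backbone computation up to exit i
        probs = softmax(head(h))  # intermediate classifier at exit i
        conf = max(probs)
        if conf >= threshold or i == last:
            return probs.index(conf), i
    raise ValueError("at least one stage is required")


# Toy stand-ins: each stage sharpens the features, each head is the identity.
toy_stages = [lambda h: [4.0 * v for v in h]] * 3
toy_exits = [lambda h: list(h)] * 3

easy = early_exit_predict([0.2, 0.1], toy_stages, toy_exits, threshold=0.8)
hard = early_exit_predict([0.01, 0.0], toy_stages, toy_exits, threshold=0.8)
```

In this toy setup the easier input leaves the network before the final stage, while the harder one runs to the last exit. Where the exits are placed and how the threshold is chosen are the design decisions the abstract says HAPI co-optimises; here they are simply fixed by hand.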
Pages: 9