HAPI: Hardware-Aware Progressive Inference

Citations: 23
Authors
Laskaridis, Stefanos [1]
Venieris, Stylianos I. [1]
Kim, Hyeji [1]
Lane, Nicholas D. [1, 2]
Affiliations
[1] Samsung AI Center, Cambridge, England
[2] University of Cambridge, Cambridge, England
Keywords
MULTIOBJECTIVE OPTIMIZATION
DOI
10.1145/3400302.3415698
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Convolutional neural networks (CNNs) have recently become the state-of-the-art in a diversity of AI tasks. Despite their popularity, CNN inference still comes at a high computational cost. A growing body of work aims to alleviate this by exploiting the difference in the classification difficulty among samples and early-exiting at different stages of the network. Nevertheless, existing studies on early exiting have primarily focused on the training scheme, without considering the use-case requirements or the deployment platform. This work presents HAPI, a novel methodology for generating high-performance early-exit networks by co-optimising the placement of intermediate exits together with the early-exit strategy at inference time. Furthermore, we propose an efficient design space exploration algorithm which enables the faster traversal of a large number of alternative architectures and generates the highest-performing design, tailored to the use-case requirements and target hardware. Quantitative evaluation shows that our system consistently outperforms alternative search mechanisms and state-of-the-art early-exit schemes across various latency budgets. Moreover, it pushes further the performance of highly optimised hand-crafted early-exit CNNs, delivering up to 5.11x speedup over lightweight models on imposed latency-driven SLAs for embedded devices.
Pages: 9