HiTDL: High-Throughput Deep Learning Inference at the Hybrid Mobile Edge

Cited by: 31
Authors
Wu, Jing [1]
Wang, Lin [2, 3]
Pei, Qiangyu [1]
Cui, Xingqi [1]
Liu, Fangming [1]
Yang, Tingting [4]
Affiliations
[1] Huazhong Univ Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Serv Comp Technol & Syst Lab, Cluster & Grid Comp Lab,Sch Comp Sci & Technol, Wuhan 430074, Peoples R China
[2] Vrije Univ Amsterdam, NL-1081 HV Amsterdam, Netherlands
[3] Tech Univ Darmstadt, D-64289 Darmstadt, Germany
[4] Peng Cheng Lab, Shenzhen 518066, Peoples R China
Keywords
Deep learning inference; edge computing; resource allocation; systems for machine learning
DOI
10.1109/TPDS.2022.3195664
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Deep neural networks (DNNs) have become a critical component for inference in modern mobile applications, but the efficient provisioning of DNNs is non-trivial. Existing mobile- and server-based approaches compromise either inference accuracy or latency. Instead, a hybrid approach can reap the benefits of both by splitting the DNN at an appropriate layer and running the two parts on the mobile device and the server, respectively. Nevertheless, DNN throughput in the hybrid approach has not been carefully examined, which is particularly important for edge servers, where limited compute resources are shared among multiple DNNs. This article presents HiTDL, a runtime framework for managing multiple DNNs provisioned following the hybrid approach at the edge. HiTDL's mission is to improve edge resource efficiency by optimizing the combined throughput of all co-located DNNs while still guaranteeing their SLAs. To this end, HiTDL first builds comprehensive performance models for DNN inference latency and throughput with respect to multiple factors, including resource availability, DNN partition plan, and cross-DNN interference. HiTDL then uses these models to generate a set of candidate partition plans with SLA guarantees for each DNN. Finally, HiTDL makes global throughput-optimal resource allocation decisions by selecting partition plans from the candidate set for each DNN via solving a fairness-aware multiple-choice knapsack problem. Experimental results based on a prototype implementation show that HiTDL improves the overall throughput of the edge by 4.3x compared with the state of the art.
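The final selection step in the abstract can be illustrated with a plain multiple-choice knapsack: exactly one candidate plan is picked per DNN so that total throughput is maximized within a shared resource budget. The sketch below is not HiTDL's solver. It assumes integer resource units, omits the fairness term and the performance models entirely, and uses hypothetical names (`select_plans`, `Plan`) and made-up numbers.

```python
from typing import List, Optional, Tuple

# A candidate plan is (resource_units, throughput). Both fields are
# hypothetical stand-ins for whatever a performance model would output.
Plan = Tuple[int, float]


def select_plans(
    candidates: List[List[Plan]], capacity: int
) -> Optional[Tuple[float, List[int]]]:
    """Pick exactly one plan per DNN, maximizing total throughput.

    Returns (total_throughput, chosen_plan_index_per_dnn), or None when
    no combination fits within `capacity` resource units.
    """
    neg_inf = float("-inf")
    # best[c]: max throughput of the DNNs processed so far using at most c units.
    best = [0.0] * (capacity + 1)
    # choice[i][c]: plan index picked for DNN i when the budget is c.
    choice: List[List[int]] = []

    for plans in candidates:
        new_best = [neg_inf] * (capacity + 1)
        picked = [-1] * (capacity + 1)
        for c in range(capacity + 1):
            for j, (cost, throughput) in enumerate(plans):
                if cost <= c and best[c - cost] != neg_inf:
                    value = best[c - cost] + throughput
                    if value > new_best[c]:
                        new_best[c] = value
                        picked[c] = j
        best = new_best
        choice.append(picked)

    if best[capacity] == neg_inf:
        return None

    # Walk back from the last DNN to recover the chosen plan indices.
    chosen: List[int] = []
    c = capacity
    for i in range(len(candidates) - 1, -1, -1):
        j = choice[i][c]
        chosen.append(j)
        c -= candidates[i][j][0]
    chosen.reverse()
    return best[capacity], chosen


# Illustrative numbers only: three DNNs, six shared resource units.
example = [
    [(1, 10.0), (2, 18.0), (3, 24.0)],
    [(1, 8.0), (2, 20.0)],
    [(2, 15.0), (3, 30.0)],
]
print(select_plans(example, 6))  # (60.0, [0, 1, 1])
```

The dynamic program runs in time proportional to the capacity times the total number of candidate plans, which is why restricting each DNN to a small SLA-feasible candidate set keeps the selection step cheap.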
Pages: 4499-4514
Number of pages: 16