HiTDL: High-Throughput Deep Learning Inference at the Hybrid Mobile Edge

Cited by: 31

Authors:

Wu, Jing [1]

Wang, Lin [2, 3]

Pei, Qiangyu [1]

Cui, Xingqi [1]

Liu, Fangming [1]

Yang, Tingting [4]

Affiliations:

[1] Huazhong Univ Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Serv Comp Technol & Syst Lab, Cluster & Grid Comp Lab,Sch Comp Sci & Technol, Wuhan 430074, Peoples R China

[2] Vrije Univ Amsterdam, NL-1081 HV Amsterdam, Netherlands

[3] Tech Univ Darmstadt, D-64289 Darmstadt, Germany

[4] Peng Cheng Lab, Shenzhen 518066, Peoples R China

Source:

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS | 2022 / Vol. 33 / No. 12

Keywords:

Deep learning inference; edge computing; resource allocation; systems for machine learning

DOI:

10.1109/TPDS.2022.3195664

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

Deep neural networks (DNNs) have become a critical component for inference in modern mobile applications, but the efficient provisioning of DNNs is non-trivial. Existing mobile- and server-based approaches compromise either the inference accuracy or latency. Instead, a hybrid approach can reap the benefits of the two by splitting the DNN at an appropriate layer and running the two parts separately on the mobile device and the server, respectively. Nevertheless, the DNN throughput in the hybrid approach has not been carefully examined, which is particularly important for edge servers where limited compute resources are shared among multiple DNNs. This article presents HiTDL, a runtime framework for managing multiple DNNs provisioned following the hybrid approach at the edge. HiTDL's mission is to improve edge resource efficiency by optimizing the combined throughput of all co-located DNNs, while still guaranteeing their SLAs. To this end, HiTDL first builds comprehensive performance models for DNN inference latency and throughput with respect to multiple factors including resource availability, DNN partition plan, and cross-DNN interference. HiTDL then uses these models to generate a set of candidate partition plans with SLA guarantees for each DNN. Finally, HiTDL makes global throughput-optimal resource allocation decisions by selecting partition plans from the candidate set for each DNN via solving a fairness-aware multiple-choice knapsack problem. Experimental results based on a prototype implementation show that HiTDL improves the overall throughput of the edge by 4.3x compared with the state-of-the-art.
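The last step of the abstract, picking one partition plan per DNN under a shared resource budget, can be illustrated with a plain multiple-choice knapsack solved by dynamic programming. This is a minimal sketch and not the paper's algorithm: the function name `select_plans`, the `(resource units, throughput)` plan representation, and the integer resource budget are all assumptions made for illustration, and the fairness term mentioned in the abstract is deliberately left out.

```python
from typing import List, Optional, Tuple

# Hypothetical plan representation: (edge resource units required, throughput).
Plan = Tuple[int, float]


def select_plans(
    candidates: List[List[Plan]], capacity: int
) -> Optional[Tuple[float, List[int]]]:
    """Pick exactly one plan per DNN, maximizing total throughput.

    Returns (best total throughput, chosen plan index per DNN), or None
    when no combination fits within `capacity` resource units.
    """
    neg_inf = float("-inf")
    # best[c]: max total throughput of the DNNs processed so far
    # when at most c resource units may be used.
    best = [0.0] * (capacity + 1)
    # choice[i][c]: plan index picked for DNN i at budget c (-1 if infeasible).
    choice: List[List[int]] = []

    for plans in candidates:
        new_best = [neg_inf] * (capacity + 1)
        picked = [-1] * (capacity + 1)
        for c in range(capacity + 1):
            for j, (cost, throughput) in enumerate(plans):
                if cost <= c and best[c - cost] != neg_inf:
                    value = best[c - cost] + throughput
                    if value > new_best[c]:
                        new_best[c] = value
                        picked[c] = j
        best = new_best
        choice.append(picked)

    if best[capacity] == neg_inf:
        return None

    # Walk back from the last DNN to recover the selected plans.
    selection: List[int] = []
    c = capacity
    for i in range(len(candidates) - 1, -1, -1):
        j = choice[i][c]
        selection.append(j)
        c -= candidates[i][j][0]
    selection.reverse()
    return best[capacity], selection


# Two DNNs, each with two SLA-feasible candidate plans (made-up numbers).
example = [[(1, 10.0), (2, 15.0)], [(1, 8.0), (3, 20.0)]]
result = select_plans(example, capacity=4)
```

With a budget of 4 units, the sketch pairs the cheaper plan of the first DNN with the more expensive plan of the second, since the two expensive plans together would need 5 units. A fairness-aware variant would additionally constrain or penalize how unevenly resources are spread across DNNs.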

Pages: 4499-4514

Number of pages: 16
