PLAYS: Minimizing DNN Inference Latency in Serverless Edge Cloud for Artificial Intelligence of Things

Cited by: 1
|
Authors
Geng, Hongmin [1 ,2 ]
Zeng, Deze [1 ,2 ]
Li, Yuepeng [3 ]
Gu, Lin [4 ]
Chen, Quan [3 ]
Li, Peng [5 ]
Affiliations
[1] China Univ Geosci, Engn Res Ctr Nat Resource Informat Management & Di, Minist Educ, Wuhan 430074, Peoples R China
[2] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
[3] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[4] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan 430074, Peoples R China
[5] Univ Aizu, Sch Comp Sci & Engn, Aizu Wakamatsu 9658580, Japan
Source
IEEE INTERNET OF THINGS JOURNAL | 2024, Vol. 11, No. 23
Funding
Japan Society for the Promotion of Science (JSPS); Japan Science and Technology Agency (JST);
Keywords
Containers; Task analysis; Artificial neural networks; Internet of Things; Inference algorithms; Artificial intelligence; Computational modeling; Artificial Intelligence of Things (AIoT); distributed Deep neural network (DNN) inference; serverless edge cloud; task scheduling;
DOI
10.1109/JIOT.2024.3443289
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Thanks to its capability for fine-grained resource allocation and fast task scheduling, serverless computing has been adopted in edge clouds to accommodate various applications, e.g., deep neural network (DNN) inference for the Artificial Intelligence of Things (AIoT). In a serverless edge cloud, servers are started up on demand. However, in such a container-based architecture, the inherently sequential startup of containers strongly affects DNN inference performance. In this article, we investigate the distributed DNN inference problem in the serverless edge cloud with consideration of this characteristic, aiming to eliminate the extra container startup time and thereby minimize the DNN inference latency. We formulate this problem in a nonlinear optimization form and then linearize it into an integer programming problem, which is proved to be NP-hard. To tackle the computational complexity, we propose a priority-based layer scheduling (PLAYS) algorithm. Extensive experimental results verify the effectiveness and adaptability of our PLAYS algorithm in comparison with other state-of-the-art algorithms under several well-known DNN models.
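The record does not describe the internals of PLAYS, but the abstract's core idea — placing sequential DNN layers on serverless nodes while accounting for container cold-start delay — can be sketched as a generic priority-based greedy scheduler. Everything below (`priority_schedule`, the per-layer compute times, the single uniform `startup` delay) is an illustrative assumption, not the paper's actual algorithm:

```python
def priority_schedule(layers, nodes, startup):
    """Greedy priority-based layer scheduling sketch (illustrative only).

    layers:  list of per-layer compute times, executed strictly in sequence
    nodes:   number of edge nodes available
    startup: container startup delay, paid once per node on its first layer
    Returns (finish_time, per-layer node assignment).
    """
    avail = [0.0] * nodes      # time each node next becomes free
    started = [False] * nodes  # whether a node's container is already warm
    assignment = []
    t = 0.0                    # finish time of the previous layer
    for cost in layers:
        # Priority rule: pick the node with the earliest effective start;
        # a cold container must first pay its sequential startup delay.
        def effective_start(n):
            return max(avail[n], t) + (0.0 if started[n] else startup)
        best = min(range(nodes), key=effective_start)
        begin = effective_start(best)
        started[best] = True
        t = begin + cost       # next layer depends on this one finishing
        avail[best] = t
        assignment.append(best)
    return t, assignment
```

With a nonzero startup delay the rule naturally keeps sequential layers on an already-warm container instead of paying repeated cold starts, which mirrors the abstract's goal of eliminating extra container startup cost; the real PLAYS algorithm in the paper may differ substantially.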
Pages: 37731-37740
Page count: 10
Related Papers
50 results in total
  • [21] Exploring In-Memory Accelerators and FPGAs for Latency-Sensitive DNN Inference on Edge Servers
    Suvizi, Ali
    Subramaniam, Suresh
    Lan, Tian
    Venkataramani, Guru
    2024 IEEE CLOUD SUMMIT, CLOUD SUMMIT 2024, 2024, : 1 - 6
  • [22] nn-Meter: Towards Accurate Latency Prediction of DNN Inference on Diverse Edge Devices
    Zhang, Li Lyna
    Han, Shihao
    Wei, Jianyu
    Zheng, Ningxin
    Cao, Ting
    Yang, Yuqing
    Liu, Yunxin
    GETMOBILE-MOBILE COMPUTING & COMMUNICATIONS REVIEW, 2021, 25 (04) : 19 - 23
  • [23] Research on cloud-edge joint task inference algorithm in edge intelligence
    Zheng, Yaping
    Journal of Computers (Taiwan), 2021, 32 (04) : 211 - 224
  • [24] EINS: Edge-Cloud Deep Model Inference with Network-Efficiency Schedule in Serverless
    Peng, Shijie
    Lin, Yanying
    Chen, Wenyan
    Tang, Yingfei
    Duan, Xu
    Ye, Kejiang
    PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024, : 1376 - 1381
  • [25] An adaptive DNN inference acceleration framework with end-edge-cloud collaborative computing
    Liu, Guozhi
    Dai, Fei
    Xu, Xiaolong
    Fu, Xiaodong
    Dou, Wanchun
    Kumar, Neeraj
    Bilal, Muhammad
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2023, 140 : 422 - 435
  • [26] AppealNet: An Efficient and Highly-Accurate Edge/Cloud Collaborative Architecture for DNN Inference
    Li, Min
    Li, Yu
    Tian, Ye
    Jiang, Li
    Xu, Qiang
    2021 58TH ACM/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2021, : 409 - 414
  • [27] Guardians of the Deep Fog: Failure-Resilient DNN Inference from Edge to Cloud
    Yousefpour, Ashkan
    Devic, Siddartha
    Nguyen, Brian Q.
    Kreidieh, Aboudy
    Liao, Alan
    Bayen, Alexandre M.
    Jue, Jason P.
    PROCEEDINGS OF THE 2019 INTERNATIONAL WORKSHOP ON CHALLENGES IN ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR INTERNET OF THINGS (AICHALLENGEIOT '19), 2019, : 25 - 31
  • [28] A Low-Latency Edge Computation Offloading Scheme for Trust Evaluation in Finance-Level Artificial Intelligence of Things
    Zhu, Xiaogang
    Ma, Feicheng
    Ding, Feng
    Guo, Zhiwei
    Yang, Junchao
    Yu, Keping
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (01): : 114 - 124
  • [29] Going to the Edge - Bringing Internet of Things and Artificial Intelligence Together
    Karner, Michael
    Hillebrand, Joachim
    Klocker, Manuela
    Samano-Robles, Ramiro
    2021 24TH EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN (DSD 2021), 2021, : 295 - 302
  • [30] A Bibliometric Analysis of Convergence of Artificial Intelligence and Blockchain for Edge of Things
    Deepak Sharma
    Rajeev Kumar
    Ki-Hyun Jung
    Journal of Grid Computing, 2023, 21