PLAYS: Minimizing DNN Inference Latency in Serverless Edge Cloud for Artificial Intelligence of Things

Cited by: 1
|
Authors
Geng, Hongmin [1 ,2 ]
Zeng, Deze [1 ,2 ]
Li, Yuepeng [3 ]
Gu, Lin [4 ]
Chen, Quan [3 ]
Li, Peng [5 ]
Affiliations
[1] China Univ Geosci, Engn Res Ctr Nat Resource Informat Management & Di, Minist Educ, Wuhan 430074, Peoples R China
[2] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
[3] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[4] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan 430074, Peoples R China
[5] Univ Aizu, Sch Comp Sci & Engn, Aizu Wakamatsu 9658580, Japan
Source
IEEE INTERNET OF THINGS JOURNAL | 2024, Vol. 11, No. 23
Funding
Japan Society for the Promotion of Science (JSPS); Japan Science and Technology Agency (JST);
Keywords
Containers; Task analysis; Artificial neural networks; Internet of Things; Inference algorithms; Artificial intelligence; Computational modeling; Artificial Intelligence of Things (AIoT); distributed Deep neural network (DNN) inference; serverless edge cloud; task scheduling;
DOI
10.1109/JIOT.2024.3443289
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Thanks to its capability for fine-grained resource allocation and fast task scheduling, serverless computing has been adopted in edge clouds to accommodate various applications, e.g., deep neural network (DNN) inference for the Artificial Intelligence of Things (AIoT). In a serverless edge cloud, servers are started up on demand. However, in such a container-based architecture, the inherently sequential startup of containers strongly affects DNN inference performance. In this article, we investigate the distributed DNN inference problem in the serverless edge cloud with consideration of this characteristic, aiming to eliminate the extra container startup time and thereby minimize the DNN inference latency. We formulate this problem in a nonlinear optimization form and then linearize it into an integer programming problem, which is proved to be NP-hard. To tackle the computational complexity, we propose a priority-based layer scheduling (PLAYS) algorithm. Extensive experimental results verify the effectiveness and adaptability of our PLAYS algorithm in comparison with other state-of-the-art algorithms under several well-known DNN models.
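The record does not describe the internals of PLAYS, but the abstract's core idea — placing sequential DNN layers on serverless nodes while accounting for container cold-start delay — can be sketched as a generic priority-based greedy scheduler. Everything below (`priority_schedule`, the per-layer compute times, the single uniform `startup` delay) is an illustrative assumption, not the paper's actual algorithm:

```python
def priority_schedule(layers, nodes, startup):
    """Greedy priority-based layer scheduling sketch (illustrative only).

    layers:  list of per-layer compute times, executed strictly in sequence
    nodes:   number of edge nodes available
    startup: container startup delay, paid once per node on its first layer
    Returns (finish_time, per-layer node assignment).
    """
    avail = [0.0] * nodes      # time each node next becomes free
    started = [False] * nodes  # whether a node's container is already warm
    assignment = []
    t = 0.0                    # finish time of the previous layer
    for cost in layers:
        # Priority rule: pick the node with the earliest effective start;
        # a cold container must first pay its sequential startup delay.
        def effective_start(n):
            return max(avail[n], t) + (0.0 if started[n] else startup)
        best = min(range(nodes), key=effective_start)
        begin = effective_start(best)
        started[best] = True
        t = begin + cost       # next layer depends on this one finishing
        avail[best] = t
        assignment.append(best)
    return t, assignment
```

With a nonzero startup delay the rule naturally keeps sequential layers on an already-warm container instead of paying repeated cold starts, which mirrors the abstract's goal of eliminating extra container startup cost; the real PLAYS algorithm in the paper may differ substantially.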
Pages: 37731-37740
Page count: 10
Related Papers
50 results in total
  • [21] Exploring In-Memory Accelerators and FPGAs for Latency-Sensitive DNN Inference on Edge Servers
    Suvizi, Ali
    Subramaniam, Suresh
    Lan, Tian
    Venkataramani, Guru
    2024 IEEE CLOUD SUMMIT, CLOUD SUMMIT 2024, 2024, : 1 - 6
  • [22] nn-Meter: Towards Accurate Latency Prediction of DNN Inference on Diverse Edge Devices
    Zhang, Li Lyna
    Han, Shihao
    Wei, Jianyu
    Zheng, Ningxin
    Cao, Ting
    Yang, Yuqing
    Liu, Yunxin
    GETMOBILE-MOBILE COMPUTING & COMMUNICATIONS REVIEW, 2021, 25 (04) : 19 - 23
  • [23] Research on cloud-edge joint task inference algorithm in edge intelligence
    Zheng, Yaping
    Journal of Computers (Taiwan), 2021, 32 (04) : 211 - 224
  • [24] EINS: Edge-Cloud Deep Model Inference with Network-Efficiency Schedule in Serverless
    Peng, Shijie
    Lin, Yanying
    Chen, Wenyan
    Tang, Yingfei
    Duan, Xu
    Ye, Kejiang
    PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024, : 1376 - 1381
  • [25] An adaptive DNN inference acceleration framework with end-edge-cloud collaborative computing
    Liu, Guozhi
    Dai, Fei
    Xu, Xiaolong
    Fu, Xiaodong
    Dou, Wanchun
    Kumar, Neeraj
    Bilal, Muhammad
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2023, 140 : 422 - 435
  • [26] AppealNet: An Efficient and Highly-Accurate Edge/Cloud Collaborative Architecture for DNN Inference
    Li, Min
    Li, Yu
    Tian, Ye
    Jiang, Li
    Xu, Qiang
    2021 58TH ACM/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2021, : 409 - 414
  • [27] Guardians of the Deep Fog: Failure-Resilient DNN Inference from Edge to Cloud
    Yousefpour, Ashkan
    Devic, Siddartha
    Nguyen, Brian Q.
    Kreidieh, Aboudy
    Liao, Alan
    Bayen, Alexandre M.
    Jue, Jason P.
    PROCEEDINGS OF THE 2019 INTERNATIONAL WORKSHOP ON CHALLENGES IN ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR INTERNET OF THINGS (AICHALLENGEIOT '19), 2019, : 25 - 31
  • [28] A Low-Latency Edge Computation Offloading Scheme for Trust Evaluation in Finance-Level Artificial Intelligence of Things
    Zhu, Xiaogang
    Ma, Feicheng
    Ding, Feng
    Guo, Zhiwei
    Yang, Junchao
    Yu, Keping
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (01): : 114 - 124
  • [29] Going to the Edge - Bringing Internet of Things and Artificial Intelligence Together
    Karner, Michael
    Hillebrand, Joachim
    Klocker, Manuela
    Samano-Robles, Ramiro
    2021 24TH EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN (DSD 2021), 2021, : 295 - 302
  • [30] A Bibliometric Analysis of Convergence of Artificial Intelligence and Blockchain for Edge of Things
    Deepak Sharma
    Rajeev Kumar
    Ki-Hyun Jung
    Journal of Grid Computing, 2023, 21