Accelerating Deep Neural Network Tasks Through Edge-Device Adaptive Inference

Cited by: 0

Authors:

Zhang, Xinyang [1]

Teng, Yinglei [1]

Wang, Nan [1]

Sun, Boya [1]

Hu, Gang [1]

Affiliations:

[1] Beijing University of Posts and Telecommunications (BUPT), Beijing Key Laboratory of Space-ground Interconnection and Convergence, Xitucheng Rd 10, Beijing 100876, People's Republic of China

Source:

2023 IEEE 34th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), 2023

Funding:

National Natural Science Foundation of China; National Key Research and Development Program of China

Keywords:

Deep Neural Networks (DNNs); Edge Computing; Task Offloading; Early Exit; Model Partition

DOI:

10.1109/PIMRC56721.2023.10293996

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]

Subject classification codes:

0808; 0809

Abstract:

As a key technology of artificial intelligence (AI), deep neural networks (DNNs) are widely used in mobile applications such as video analytics for autonomous driving. However, the constrained computation capabilities of mobile devices (MDs) make it challenging to meet the strict accuracy and real-time demands of DNN tasks, which can seriously degrade quality of service (QoS). A popular alternative is to offload DNN tasks to edge servers for inference; however, this imposes a heavy communication burden because large amounts of raw data must be transmitted. In this paper, we propose an adaptive DNN co-inference (ADCI) strategy that combines early-exit and model-partition policies to divide computation flexibly between devices and edge servers with elastic execution. We establish a balanced utility function and jointly optimize dynamic offloading and model adoption in a multi-user, multi-server edge computing system. To tackle the high coupling among mixed variables, we propose a two-stage deep reinforcement learning (DRL) algorithm. The early-exit and model-partition decisions are tracked using the Lagrange method as a soft option. Results from simulations show that the ADCI strategy performs well with timely accuracy


Pages: 6
