Accelerating Deep Neural Network Tasks Through Edge-Device Adaptive Inference

Cited: 0
Authors
Zhang, Xinyang [1 ]
Teng, Yinglei [1 ]
Wang, Nan [1 ]
Sun, Boya [1 ]
Hu, Gang [1 ]
Affiliations
[1] Beijing University of Posts and Telecommunications (BUPT), Beijing Key Laboratory of Space-Ground Interconnection and Convergence, Xitucheng Rd 10, Beijing 100876, China
Funding
National Natural Science Foundation of China; National Key R&D Program of China
Keywords
Deep Neural Networks (DNNs); Edge Computing; Task Offloading; Early Exit; Model Partition;
DOI
10.1109/PIMRC56721.2023.10293996
CLC Classification
TM [Electrical Technology]; TN [Electronic and Communication Technology]
Discipline Codes
0808; 0809
Abstract
As a key technology of artificial intelligence (AI), Deep Neural Networks (DNNs) are widely used in mobile applications such as video analytics for autonomous driving. However, due to the constrained computation capability of mobile devices (MDs), it is challenging to meet the strict accuracy and real-time demands of DNN tasks, which can cause a serious drop in quality of service (QoS). A popular alternative is to offload DNN tasks to edge servers for intelligent inference; this, however, imposes a heavy communication burden because of the large amount of raw data to transmit. In this paper, we propose an adaptive DNN co-inference (ADCI) strategy that achieves flexible computation division between devices and edge servers with elastic execution by combining early-exit and model-partition policies. Establishing a balanced utility function, we jointly optimize dynamic offloading and model adoption in a multi-user, multi-server edge computing system. To tackle the strong coupling among the mixed variables, we propose a two-stage deep reinforcement learning (DRL) algorithm, in which the early-exit and model-partition decisions are tracked using the Lagrange method as a soft option. Simulation results show that the ADCI strategy performs well in terms of timely accuracy.
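
The abstract centers on combining early exit and model partition for device-edge co-inference. As a rough sketch of how these two mechanisms fit together at inference time (not the paper's ADCI implementation), the PyTorch snippet below attaches an exit branch at a partition point: the device runs the head, takes the early exit when its softmax confidence clears a threshold, and otherwise hands the intermediate feature to the edge-side tail. The network shape, the 0.8 threshold, and the co_infer helper are illustrative assumptions; in a deployed system the tail would run on the edge server and the feature tensor would cross the network, while here both halves run in one process only to keep the sketch self-contained.

    # Illustrative sketch of early-exit + model-partition co-inference.
    # All shapes, names, and the threshold are assumptions, not values
    # from the paper.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class EarlyExitPartitionedNet(nn.Module):
        def __init__(self, num_classes=10):
            super().__init__()
            # "Head": layers executed on the mobile device.
            self.head = nn.Sequential(
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
            )
            # Early-exit branch attached at the partition point.
            self.exit_branch = nn.Sequential(
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(16, num_classes),
            )
            # "Tail": layers that would be offloaded to the edge server.
            self.tail = nn.Sequential(
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(32, num_classes),
            )

    def co_infer(model, x, threshold=0.8):
        """Run the head locally; exit early if confident, else offload.

        `threshold` is a tunable confidence level, not a value from the
        paper. In a real system the tensor `feat` would be serialized
        and sent over the network instead of calling model.tail directly.
        """
        with torch.no_grad():
            feat = model.head(x)                      # on-device compute
            probs = F.softmax(model.exit_branch(feat), dim=1)
            conf, pred = probs.max(dim=1)
            if conf.item() >= threshold:              # confident: stop here
                return pred.item(), "early-exit (device)"
            logits = model.tail(feat)                 # offloaded compute
            return logits.argmax(dim=1).item(), "full model (edge)"

    if __name__ == "__main__":
        net = EarlyExitPartitionedNet().eval()
        label, path = co_infer(net, torch.randn(1, 3, 32, 32))
        print(f"prediction={label} via {path}")
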
Pages: 6
Related Papers
50 records
  • [1] Collaborative Intelligence: Accelerating Deep Neural Network Inference via Device-Edge Synergy
    Shan, Nanliang
    Ye, Zecong
    Cui, Xiaolong
    SECURITY AND COMMUNICATION NETWORKS, 2020, 2020
  • [2] Edge AI: On-Demand Accelerating Deep Neural Network Inference via Edge Computing
    Li, En
    Zeng, Liekang
    Zhou, Zhi
    Chen, Xu
    IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2020, 19 (01) : 447 - 457
  • [3] Hierarchical Deep Neural Network Inference for Device-Edge-Cloud Systems
    Ilhan, Fatih
    Tekin, Selim F.
    Hu, Sihao
    Huang, Tiansheng
    Chow, Ka-Ho
    Liu, Ling
    COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023, 2023, : 302 - 305
  • [4] Adaptive Deep Neural Network Ensemble for Inference-as-a-Service on Edge Computing Platforms
    Bai, Yang
    Chen, Lixing
    Zhang, Letian
    Abdel-Mottaleb, Mohamed
    Xu, Jie
    2021 IEEE 18TH INTERNATIONAL CONFERENCE ON MOBILE AD HOC AND SMART SYSTEMS (MASS 2021), 2021, : 27 - 35
  • [5] Learning-Based Edge-Device Collaborative DNN Inference in IoVT Networks
    Xu, Xiaodong
    Yan, Kaiwen
    Han, Shujun
    Wang, Bizhu
    Tao, Xiaofeng
    Zhang, Ping
IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (05) : 7989 - 8004
  • [6] Energy-efficient cooperative inference via adaptive deep neural network splitting at the edge
    Labriji, Ibtissam
    Merluzzi, Mattia
    Airod, Fatima Ezzahra
    Strinati, Emilio Calvanese
    ICC 2023-IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2023, : 1712 - 1717
  • [7] Adaptive Distributed Convolutional Neural Network Inference at the Network Edge with ADCNN
    Zhang, Sai Qian
    Lin, Jieyu
    Zhang, Qi
PROCEEDINGS OF THE 49TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2020, 2020
  • [8] Advancements in Accelerating Deep Neural Network Inference on AIoT Devices: A Survey
    Cheng, L.
    Gu, Y.
    Liu, Q.
    Yang, L.
    Liu, C.
    Wang, Y.
    IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING, 2024, 9 (06) : 1 - 18
  • [9] Rethinking Pruning for Accelerating Deep Inference At the Edge
    Gao, Dawei
    He, Xiaoxi
    Zhou, Zimu
    Tong, Yongxin
    Xu, Ke
    Thiele, Lothar
    KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 155 - 164
  • [10] Accelerating Neural Network Inference With Processing-in-DRAM: From the Edge to the Cloud
    Oliveira, Geraldo F.
    Gomez-Luna, Juan
    Ghose, Saugata
    Boroumand, Amirali
    Mutlu, Onur
    IEEE MICRO, 2022, 42 (06) : 25 - 38