Accelerating Deep Neural Network Tasks Through Edge-Device Adaptive Inference

Cited: 0
Authors
Zhang, Xinyang [1 ]
Teng, Yinglei [1 ]
Wang, Nan [1 ]
Sun, Boya [1 ]
Hu, Gang [1 ]
Affiliations
[1] Beijing University of Posts and Telecommunications (BUPT), Beijing Key Laboratory of Space-Ground Interconnection and Convergence, Xitucheng Rd 10, Beijing 100876, China
Funding
National Natural Science Foundation of China; National Key R&D Program of China
Keywords
Deep Neural Networks (DNNs); Edge Computing; Task Offloading; Early Exit; Model Partition;
DOI
10.1109/PIMRC56721.2023.10293996
CLC Classification
TM [Electrical Technology]; TN [Electronic and Communication Technology]
Discipline Codes
0808; 0809
Abstract
As a key technology of artificial intelligence (AI), Deep Neural Networks (DNNs) are widely used in mobile applications such as video analytics for autonomous driving. However, due to the constrained computation capability of mobile devices (MDs), it is challenging to meet the strict accuracy and real-time demands of DNN tasks, which can cause a serious drop in quality of service (QoS). A popular alternative is to offload DNN tasks to edge servers for intelligent inference; this, however, imposes a heavy communication burden because of the large amount of raw data to transmit. In this paper, we propose an adaptive DNN co-inference (ADCI) strategy that achieves flexible computation division between devices and edge servers with elastic execution by combining early-exit and model-partition policies. Establishing a balanced utility function, we jointly optimize dynamic offloading and model adoption in a multi-user, multi-server edge computing system. To tackle the strong coupling among the mixed variables, we propose a two-stage deep reinforcement learning (DRL) algorithm, in which the early-exit and model-partition decisions are tracked using the Lagrange method as a soft option. Simulation results show that the ADCI strategy performs well in terms of timely accuracy.
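
The abstract centers on combining early exit and model partition for device-edge co-inference. As a rough sketch of how these two mechanisms fit together at inference time (not the paper's ADCI implementation), the PyTorch snippet below attaches an exit branch at a partition point: the device runs the head, takes the early exit when its softmax confidence clears a threshold, and otherwise hands the intermediate feature to the edge-side tail. The network shape, the 0.8 threshold, and the co_infer helper are illustrative assumptions; in a deployed system the tail would run on the edge server and the feature tensor would cross the network, while here both halves run in one process only to keep the sketch self-contained.

    # Illustrative sketch of early-exit + model-partition co-inference.
    # All shapes, names, and the threshold are assumptions, not values
    # from the paper.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class EarlyExitPartitionedNet(nn.Module):
        def __init__(self, num_classes=10):
            super().__init__()
            # "Head": layers executed on the mobile device.
            self.head = nn.Sequential(
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
            )
            # Early-exit branch attached at the partition point.
            self.exit_branch = nn.Sequential(
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(16, num_classes),
            )
            # "Tail": layers that would be offloaded to the edge server.
            self.tail = nn.Sequential(
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(32, num_classes),
            )

    def co_infer(model, x, threshold=0.8):
        """Run the head locally; exit early if confident, else offload.

        `threshold` is a tunable confidence level, not a value from the
        paper. In a real system the tensor `feat` would be serialized
        and sent over the network instead of calling model.tail directly.
        """
        with torch.no_grad():
            feat = model.head(x)                      # on-device compute
            probs = F.softmax(model.exit_branch(feat), dim=1)
            conf, pred = probs.max(dim=1)
            if conf.item() >= threshold:              # confident: stop here
                return pred.item(), "early-exit (device)"
            logits = model.tail(feat)                 # offloaded compute
            return logits.argmax(dim=1).item(), "full model (edge)"

    if __name__ == "__main__":
        net = EarlyExitPartitionedNet().eval()
        label, path = co_infer(net, torch.randn(1, 3, 32, 32))
        print(f"prediction={label} via {path}")
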
Pages: 6
Related Papers
50 records
  • [1] Collaborative Intelligence: Accelerating Deep Neural Network Inference via Device-Edge Synergy
    Shan, Nanliang
    Ye, Zecong
    Cui, Xiaolong
    SECURITY AND COMMUNICATION NETWORKS, 2020, 2020
  • [2] Edge AI: On-Demand Accelerating Deep Neural Network Inference via Edge Computing
    Li, En
    Zeng, Liekang
    Zhou, Zhi
    Chen, Xu
    IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2020, 19 (01) : 447 - 457
  • [3] Hierarchical Deep Neural Network Inference for Device-Edge-Cloud Systems
    Ilhan, Fatih
    Tekin, Selim F.
    Hu, Sihao
    Huang, Tiansheng
    Chow, Ka-Ho
    Liu, Ling
    COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023, 2023, : 302 - 305
  • [4] Adaptive Deep Neural Network Ensemble for Inference-as-a-Service on Edge Computing Platforms
    Bai, Yang
    Chen, Lixing
    Zhang, Letian
    Abdel-Mottaleb, Mohamed
    Xu, Jie
    2021 IEEE 18TH INTERNATIONAL CONFERENCE ON MOBILE AD HOC AND SMART SYSTEMS (MASS 2021), 2021, : 27 - 35
  • [5] Learning-Based Edge-Device Collaborative DNN Inference in IoVT Networks
    Xu, Xiaodong
    Yan, Kaiwen
    Han, Shujun
    Wang, Bizhu
    Tao, Xiaofeng
    Zhang, Ping
IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (05) : 7989 - 8004
  • [6] Energy-efficient cooperative inference via adaptive deep neural network splitting at the edge
    Labriji, Ibtissam
    Merluzzi, Mattia
    Airod, Fatima Ezzahra
    Strinati, Emilio Calvanese
    ICC 2023-IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2023, : 1712 - 1717
  • [7] Adaptive Distributed Convolutional Neural Network Inference at the Network Edge with ADCNN
    Zhang, Sai Qian
    Lin, Jieyu
    Zhang, Qi
PROCEEDINGS OF THE 49TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2020, 2020
  • [8] Advancements in Accelerating Deep Neural Network Inference on AIoT Devices: A Survey
    Cheng, L.
    Gu, Y.
    Liu, Q.
    Yang, L.
    Liu, C.
    Wang, Y.
    IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING, 2024, 9 (06) : 1 - 18
  • [9] Rethinking Pruning for Accelerating Deep Inference At the Edge
    Gao, Dawei
    He, Xiaoxi
    Zhou, Zimu
    Tong, Yongxin
    Xu, Ke
    Thiele, Lothar
    KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 155 - 164
  • [10] Accelerating Neural Network Inference With Processing-in-DRAM: From the Edge to the Cloud
    Oliveira, Geraldo F.
    Gomez-Luna, Juan
    Ghose, Saugata
    Boroumand, Amirali
    Mutlu, Onur
    IEEE MICRO, 2022, 42 (06) : 25 - 38