Accelerating Deep Neural Network Tasks Through Edge-Device Adaptive Inference

Authors
Zhang, Xinyang [1]
Teng, Yinglei [1]
Wang, Nan [1]
Sun, Boya [1]
Hu, Gang [1]
Affiliations
[1] Beijing Univ Posts & Telecommun BUPT, Beijing Key Lab Spaceground Interconnect & Conver, Xitucheng Rd 10, Beijing 100876, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
Deep Neural Networks (DNNs); Edge Computing; Task Offloading; Early Exit; Model Partition;
DOI
10.1109/PIMRC56721.2023.10293996
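Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract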
As a key technology of artificial intelligence (AI), Deep Neural Networks (DNNs) have been widely used in mobile applications, such as video analytics in autonomous driving. However, because of the constrained computation capabilities of mobile devices (MDs), it is challenging to meet the strict accuracy and real-time demands of DNN tasks, which can result in a serious drop in quality of service (QoS). A popular alternative is to offload DNN tasks to edge servers for intelligent inference; however, this imposes a heavy communication burden because large amounts of raw data must be transmitted. In this paper, we propose an adaptive DNN co-inference (ADCI) strategy that combines early-exit and model-partition policies to obtain a flexible division of computation among devices and edge servers with elastic execution. Establishing a balanced utility function, we jointly optimize dynamic offloading and model adoption for a multi-user, multi-server edge computing system. To tackle the high coupling among mixed variables, we propose a two-stage deep reinforcement learning (DRL) algorithm. The early-exit and model partition decisions are tracked using the Lagrange method as a soft option. Results from simulations show that the ADCI strategy performs well with timely accuracy
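The interplay between early exit and model partition described in the abstract can be illustrated with a small sketch. It is not the authors' method: the paper uses a two-stage DRL algorithm with a Lagrange-based step for a multi-user, multi-server system, whereas the code below simply enumerates every (exit point, split point) pair for a single device and a single edge server. All layer costs, tensor sizes, accuracies, processor speeds, and the utility form (accuracy minus latency in seconds) are hypothetical placeholders chosen for illustration.

```python
# Hypothetical single-user sketch: brute-force choice of early-exit point
# and partition point. None of these numbers come from the paper.
LAYER_FLOPS = [2.0e8, 4.0e8, 4.0e8, 2.0e8]   # compute cost of each layer
TX_BITS = [8.0e6, 4.0e6, 2.0e6, 1.0e6]       # input size of each layer (bits)
EXIT_ACC = {1: 0.60, 2: 0.75, 3: 0.85, 4: 0.90}  # accuracy when exiting after k layers
DEVICE_FLOPS = 1.0e9    # assumed device speed (FLOP/s)
EDGE_FLOPS = 2.0e10     # assumed edge-server speed (FLOP/s)


def latency(exit_point, split, rate_bps):
    """Layers [0, split) run on the device, layers [split, exit_point) on the edge."""
    local = sum(LAYER_FLOPS[:split]) / DEVICE_FLOPS
    remote = sum(LAYER_FLOPS[split:exit_point]) / EDGE_FLOPS
    # Data is uploaded only if at least one layer runs on the edge.
    uplink = TX_BITS[split] / rate_bps if split < exit_point else 0.0
    return local + uplink + remote


def utility(exit_point, split, rate_bps):
    """Assumed utility: accuracy minus end-to-end latency in seconds."""
    return EXIT_ACC[exit_point] - latency(exit_point, split, rate_bps)


def best_plan(rate_bps):
    """Return (utility, exit_point, split) maximizing the assumed utility."""
    n = len(LAYER_FLOPS)
    candidates = [(e, s) for e in range(1, n + 1) for s in range(e + 1)]
    e, s = max(candidates, key=lambda c: utility(c[0], c[1], rate_bps))
    return utility(e, s, rate_bps), e, s


if __name__ == "__main__":
    for rate in (1.0e9, 1.0e5):
        print(rate, best_plan(rate))
```

Under these assumed numbers, a fast uplink favors offloading the full model to the edge, while a slow uplink favors an early exit computed entirely on the device. This is the kind of trade-off an adaptive co-inference strategy has to resolve.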
Pages: 6