Distributed DNN Inference With Fine-Grained Model Partitioning in Mobile Edge Computing Networks

Cited by: 5
Authors
Li, Hui [1 ]
Li, Xiuhua [1 ]
Fan, Qilin [1 ]
He, Qiang [2 ]
Wang, Xiaofei [3 ]
Leung, Victor C. M. [4 ,5 ]
Affiliations
[1] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 400000, Peoples R China
[2] Huazhong Univ Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Sch Comp Sci & Technol, Serv Comp Technol & Syst Lab,Cluster & Grid Comp L, Wuhan 430074, Hubei, Peoples R China
[3] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300072, Peoples R China
[4] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China
[5] Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC V6T1Z4, Canada
Keywords
Delays; Multitasking; Internet of Things; Artificial neural networks; Task analysis; Computational modeling; Inference algorithms; Asynchronous advantage actor-critic; distributed DNN inference; mobile edge computing; model partitioning; multi-task learning; Resource allocation; Aware; Acceleration
DOI
10.1109/TMC.2024.3357874
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812 ;
Abstract
Model partitioning is a promising technique for improving the efficiency of distributed inference by executing partial deep neural network (DNN) models on edge servers (ESs) or Internet-of-Things (IoT) devices. However, due to the heterogeneous resources of ESs and IoT devices in mobile edge computing (MEC) networks, it is non-trivial to guarantee a DNN inference speed that satisfies specific delay constraints. Meanwhile, many existing DNN models have deep and complex architectures with numerous DNN blocks, which leads to a huge search space for fine-grained model partitioning. To address these challenges, we investigate distributed DNN inference with fine-grained model partitioning, with collaboration between ESs and IoT devices. We formulate the problem and propose a multi-task-learning-based asynchronous advantage actor-critic approach to find a competitive model partitioning policy that reduces DNN inference delay. Specifically, we combine the shared layers of the actor network and critic network via soft parameter sharing, and expand the output layer into multiple branches to determine the model partitioning policy for each DNN block individually. Experimental results demonstrate that the proposed approach outperforms state-of-the-art approaches, reducing total inference delay, edge inference delay, and local inference delay by averages of 4.76%, 10.04%, and 8.03%, respectively, in the considered MEC networks.
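The multi-branch output described in the abstract can be illustrated with a minimal sketch: each branch holds a score (logit) per placement choice for one DNN block, and the joint partitioning decision is one sample per branch. Everything below — the function names, the two-way local-vs-edge choice, and the toy logits — is an illustrative assumption for exposition, not the paper's actual network or training code.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def partition_policy(branch_logits):
    """Sample one placement decision per DNN block from per-branch logits.

    branch_logits: one list of logits per DNN block; each logit scores a
    placement choice (here, 0 = run the block locally on the IoT device,
    1 = offload the block to the edge server). This mirrors the idea of
    expanding the policy output into one branch per block, so each block's
    partitioning decision is made individually.
    """
    decisions = []
    for logits in branch_logits:
        probs = softmax(logits)
        # Inverse-CDF sampling from this branch's categorical distribution.
        r, acc = random.random(), 0.0
        choice = len(probs) - 1
        for i, p in enumerate(probs):
            acc += p
            if r < acc:
                choice = i
                break
        decisions.append(choice)
    return decisions

# Toy example: 4 DNN blocks, each with 2 placement choices (local vs. edge).
random.seed(0)
logits_per_block = [[0.2, 1.5], [2.0, -1.0], [0.0, 0.0], [-0.5, 0.5]]
decisions = partition_policy(logits_per_block)
print(decisions)  # → [1, 0, 0, 0]
```

In a full actor-critic setup, the shared layers would produce a feature vector from the network state, each branch would map it to its own logits, and the critic would score the resulting joint placement; the sketch above only shows how per-branch sampling yields a per-block partition.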
Pages: 9060 - 9074
Page count: 15
Related Papers
50 records total
  • [41] Fine-grained data access control for distributed sensor networks
    Hur, Junbeom
    WIRELESS NETWORKS, 2011, 17 (05) : 1235 - 1249
  • [42] Video deblocking with fine-grained scalable complexity for embedded mobile computing
    Yu, ZH
    Zhang, J
    2004 7TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS 1-3, 2004, : 1173 - 1178
  • [43] Fine-Grained Task-Dependency Offloading in Mobile Cloud Computing
    Pan, Shengli
    Liu, Chun
    Zeng, Deze
    Yao, Hong
    Qian, Zhuzhong
    2018 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI), 2018, : 977 - 982
  • [44] Exploring Fine-Grained Sparsity in Convolutional Neural Networks for Efficient Inference
    Wang, Longguang
    Guo, Yulan
    Dong, Xiaoyu
    Wang, Yingqian
    Ying, Xinyi
    Lin, Zaiping
    An, Wei
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (04) : 4474 - 4493
  • [45] OfpCNN: On-Demand Fine-Grained Partitioning for CNN Inference Acceleration in Heterogeneous Devices
    Yang, Lei
    Zheng, Can
    Shen, Xiaoyuan
    Xie, Guoqi
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, 34 (12) : 3090 - 3103
  • [46] Joint Model, Task Partitioning and Privacy Preserving Adaptation for Edge DNN Inference
    Jiang, Jingran
    Li, Hongjia
    Wang, Liming
    2022 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC), 2022, : 1224 - 1229
  • [47] Joint multi-user DNN partitioning and task offloading in mobile edge computing
    Liao, Zhuofan
    Hu, Weibo
    Huang, Jiawei
    Wang, Jianxin
    AD HOC NETWORKS, 2023, 144
  • [48] Poster Abstract: Generative Model Based Fine-Grained Air Pollution Inference for Mobile Sensing Systems
    Ma, Rui
    Xu, Xiangxiang
    Noh, Hae Young
    Zhang, Pei
    Zhang, Lin
    SENSYS'18: PROCEEDINGS OF THE 16TH CONFERENCE ON EMBEDDED NETWORKED SENSOR SYSTEMS, 2018, : 426 - 427
  • [49] Distributed Inference Acceleration with Adaptive DNN Partitioning and Offloading
    Mohammed, Thaha
    Joe-Wong, Carlee
    Babbar, Rohit
    Di Francesco, Mario
    IEEE INFOCOM 2020 - IEEE CONFERENCE ON COMPUTER COMMUNICATIONS, 2020, : 854 - 863
  • [50] A Mobile DNN Training Processor With Automatic Bit Precision Search and Fine-Grained Sparsity Exploitation
    Han, Donghyeon
    Im, Dongseok
    Park, Gwangtae
    Kim, Youngwoo
    Song, Seokchan
    Lee, Juhyoung
    Yoo, Hoi-Jun
    IEEE MICRO, 2022, 42 (02) : 16 - 24