Distributed DNN Inference With Fine-Grained Model Partitioning in Mobile Edge Computing Networks

Cited by: 5
Authors
Li, Hui [1 ]
Li, Xiuhua [1 ]
Fan, Qilin [1 ]
He, Qiang [2 ]
Wang, Xiaofei [3 ]
Leung, Victor C. M. [4 ,5 ]
Affiliations
[1] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 400000, Peoples R China
[2] Huazhong Univ Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Sch Comp Sci & Technol, Serv Comp Technol & Syst Lab,Cluster & Grid Comp L, Wuhan 430074, Hubei, Peoples R China
[3] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300072, Peoples R China
[4] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China
[5] Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC V6T1Z4, Canada
Keywords
Delays; Multitasking; Internet of Things; Artificial neural networks; Task analysis; Computational modeling; Inference algorithms; Asynchronous advantage actor-critic; distributed DNN inference; mobile edge computing; model partitioning; multi-task learning; Resource allocation; Aware; Acceleration
DOI
10.1109/TMC.2024.3357874
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812 ;
Abstract
Model partitioning is a promising technique for improving the efficiency of distributed inference by executing partial deep neural network (DNN) models on edge servers (ESs) or Internet-of-Things (IoT) devices. However, due to the heterogeneous resources of ESs and IoT devices in mobile edge computing (MEC) networks, it is non-trivial to guarantee a DNN inference speed that satisfies specific delay constraints. Meanwhile, many existing DNN models have deep and complex architectures with numerous DNN blocks, which leads to a huge search space for fine-grained model partitioning. To address these challenges, we investigate distributed DNN inference with fine-grained model partitioning, with collaboration between ESs and IoT devices. We formulate the problem and propose a multi-task-learning-based asynchronous advantage actor-critic approach to find a competitive model partitioning policy that reduces DNN inference delay. Specifically, we combine the shared layers of the actor network and critic network via soft parameter sharing, and expand the output layer into multiple branches to determine the model partitioning policy for each DNN block individually. Experimental results demonstrate that the proposed approach outperforms state-of-the-art approaches, reducing total inference delay, edge inference delay, and local inference delay by averages of 4.76%, 10.04%, and 8.03%, respectively, in the considered MEC networks.
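The multi-branch output described in the abstract can be illustrated with a minimal sketch: each branch holds a score (logit) per placement choice for one DNN block, and the joint partitioning decision is one sample per branch. Everything below — the function names, the two-way local-vs-edge choice, and the toy logits — is an illustrative assumption for exposition, not the paper's actual network or training code.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def partition_policy(branch_logits):
    """Sample one placement decision per DNN block from per-branch logits.

    branch_logits: one list of logits per DNN block; each logit scores a
    placement choice (here, 0 = run the block locally on the IoT device,
    1 = offload the block to the edge server). This mirrors the idea of
    expanding the policy output into one branch per block, so each block's
    partitioning decision is made individually.
    """
    decisions = []
    for logits in branch_logits:
        probs = softmax(logits)
        # Inverse-CDF sampling from this branch's categorical distribution.
        r, acc = random.random(), 0.0
        choice = len(probs) - 1
        for i, p in enumerate(probs):
            acc += p
            if r < acc:
                choice = i
                break
        decisions.append(choice)
    return decisions

# Toy example: 4 DNN blocks, each with 2 placement choices (local vs. edge).
random.seed(0)
logits_per_block = [[0.2, 1.5], [2.0, -1.0], [0.0, 0.0], [-0.5, 0.5]]
decisions = partition_policy(logits_per_block)
print(decisions)  # → [1, 0, 0, 0]
```

In a full actor-critic setup, the shared layers would produce a feature vector from the network state, each branch would map it to its own logits, and the critic would score the resulting joint placement; the sketch above only shows how per-branch sampling yields a per-block partition.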
Pages: 9060 - 9074
Page count: 15
Related Papers
50 records total
  • [41] Fine-grained data access control for distributed sensor networks
    Hur, Junbeom
    WIRELESS NETWORKS, 2011, 17 (05) : 1235 - 1249
  • [42] Video deblocking with fine-grained scalable complexity for embedded mobile computing
    Yu, ZH
    Zhang, J
    2004 7TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS 1-3, 2004, : 1173 - 1178
  • [43] Fine-Grained Task-Dependency Offloading in Mobile Cloud Computing
    Pan, Shengli
    Liu, Chun
    Zeng, Deze
    Yao, Hong
    Qian, Zhuzhong
    2018 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI), 2018, : 977 - 982
  • [44] Exploring Fine-Grained Sparsity in Convolutional Neural Networks for Efficient Inference
    Wang, Longguang
    Guo, Yulan
    Dong, Xiaoyu
    Wang, Yingqian
    Ying, Xinyi
    Lin, Zaiping
    An, Wei
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (04) : 4474 - 4493
  • [45] OfpCNN: On-Demand Fine-Grained Partitioning for CNN Inference Acceleration in Heterogeneous Devices
    Yang, Lei
    Zheng, Can
    Shen, Xiaoyuan
    Xie, Guoqi
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, 34 (12) : 3090 - 3103
  • [46] Joint Model, Task Partitioning and Privacy Preserving Adaptation for Edge DNN Inference
    Jiang, Jingran
    Li, Hongjia
    Wang, Liming
    2022 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC), 2022, : 1224 - 1229
  • [47] Joint multi-user DNN partitioning and task offloading in mobile edge computing
    Liao, Zhuofan
    Hu, Weibo
    Huang, Jiawei
    Wang, Jianxin
    AD HOC NETWORKS, 2023, 144
  • [48] Poster Abstract: Generative Model Based Fine-Grained Air Pollution Inference for Mobile Sensing Systems
    Ma, Rui
    Xu, Xiangxiang
    Noh, Hae Young
    Zhang, Pei
    Zhang, Lin
    SENSYS'18: PROCEEDINGS OF THE 16TH CONFERENCE ON EMBEDDED NETWORKED SENSOR SYSTEMS, 2018, : 426 - 427
  • [49] Distributed Inference Acceleration with Adaptive DNN Partitioning and Offloading
    Mohammed, Thaha
    Joe-Wong, Carlee
    Babbar, Rohit
    Di Francesco, Mario
    IEEE INFOCOM 2020 - IEEE CONFERENCE ON COMPUTER COMMUNICATIONS, 2020, : 854 - 863
  • [50] A Mobile DNN Training Processor With Automatic Bit Precision Search and Fine-Grained Sparsity Exploitation
    Han, Donghyeon
    Im, Dongseok
    Park, Gwangtae
    Kim, Youngwoo
    Song, Seokchan
    Lee, Juhyoung
    Yoo, Hoi-Jun
    IEEE MICRO, 2022, 42 (02) : 16 - 24