Distributed DNN Inference With Fine-Grained Model Partitioning in Mobile Edge Computing Networks

Citations: 5
Authors
Li, Hui [1 ]
Li, Xiuhua [1 ]
Fan, Qilin [1 ]
He, Qiang [2 ]
Wang, Xiaofei [3 ]
Leung, Victor C. M. [4 ,5 ]
Affiliations
[1] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 400000, Peoples R China
[2] Huazhong Univ Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Sch Comp Sci & Technol, Serv Comp Technol & Syst Lab, Cluster & Grid Comp Lab, Wuhan 430074, Hubei, Peoples R China
[3] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300072, Peoples R China
[4] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China
[5] Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC V6T1Z4, Canada
Keywords
Delays; Multitasking; Internet of Things; Artificial neural networks; Task analysis; Computational modeling; Inference algorithms; Asynchronous advantage actor-critic; distributed DNN inference; mobile edge computing; model partitioning; multi-task learning; RESOURCE-ALLOCATION; AWARE; ACCELERATION
DOI
10.1109/TMC.2024.3357874
CLC Number
TP [automation technology, computer technology]
Discipline Code
0812
Abstract
Model partitioning is a promising technique for improving the efficiency of distributed inference by executing partial deep neural network (DNN) models on edge servers (ESs) or Internet-of-Things (IoT) devices. However, due to the heterogeneous resources of ESs and IoT devices in mobile edge computing (MEC) networks, it is non-trivial to guarantee that the DNN inference speed satisfies specific delay constraints. Meanwhile, many existing DNN models have deep and complex architectures with numerous DNN blocks, which leads to a huge search space for fine-grained model partitioning. To address these challenges, we investigate distributed DNN inference with fine-grained model partitioning, with collaboration between ESs and IoT devices. We formulate the problem and propose a multi-task learning based asynchronous advantage actor-critic approach to find a competitive model partitioning policy that reduces DNN inference delay. Specifically, we combine the shared layers of the actor network and critic network via soft parameter sharing, and expand the output layer into multiple branches to determine the model partitioning policy for each DNN block individually. Experimental results demonstrate that the proposed approach outperforms state-of-the-art approaches, reducing total inference delay, edge inference delay, and local inference delay by an average of 4.76%, 10.04%, and 8.03%, respectively, in the considered MEC networks.
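The architecture outlined in the abstract, a shared actor-critic trunk with soft parameter sharing and one output branch per DNN block, can be made concrete with a short sketch. The following is a minimal PyTorch illustration under stated assumptions: the class name `BranchedActorCritic`, the layer sizes, and a binary device-vs-edge choice per block are all hypothetical, and this is not the authors' implementation.

```python
# Minimal sketch of a multi-branch actor-critic with soft parameter sharing.
# All names, dimensions, and the binary action space are assumptions.
import torch
import torch.nn as nn

class BranchedActorCritic(nn.Module):
    def __init__(self, state_dim, num_blocks, num_choices, hidden=128):
        super().__init__()
        # Actor and critic keep separate trunk parameters; "soft" sharing is
        # imposed as a distance penalty between them (see soft_sharing_loss),
        # rather than hard-sharing a single set of weights.
        self.actor_trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.critic_trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        # One output branch per DNN block: each branch emits logits over the
        # partition choices (e.g., run on IoT device vs. edge server).
        self.branches = nn.ModuleList(
            [nn.Linear(hidden, num_choices) for _ in range(num_blocks)])
        self.value_head = nn.Linear(hidden, 1)

    def forward(self, state):
        feat = self.actor_trunk(state)
        logits = [branch(feat) for branch in self.branches]  # per-block policies
        value = self.value_head(self.critic_trunk(state))    # state value
        return logits, value

    def soft_sharing_loss(self):
        # L2 distance between corresponding trunk parameters: nudges the actor
        # and critic toward similar low-level features without forcing
        # identical weights.
        return sum(((pa - pc) ** 2).sum()
                   for pa, pc in zip(self.actor_trunk.parameters(),
                                     self.critic_trunk.parameters()))

# Sampling one partitioning decision per DNN block from the branch policies.
net = BranchedActorCritic(state_dim=16, num_blocks=8, num_choices=2)
logits, value = net(torch.randn(1, 16))
policy = [torch.distributions.Categorical(logits=l).sample().item() for l in logits]
```

One point this sketch makes clear is why the branching matters: each branch decides for its own block, so a single forward pass yields a full partitioning policy, instead of choosing from a flat action space of size num_choices^num_blocks.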
Pages: 9060 - 9074
Number of pages: 15
Related Papers
50 records in total
  • [31] A Fine-Grained Performance Model of Cloud Computing Centers
    Khazaei, Hamzeh
    Misic, Jelena
    Misic, Vojislav B.
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2013, 24 (11) : 2138 - 2147
  • [32] A fine-grained parallel programming model for grid computing
    Yang, GW
    Wang, Q
    Wu, YW
    Huang, DZ
    2004 IEEE INTERNATIONAL CONFERENCE ON SERVICES COMPUTING, PROCEEDINGS, 2004, : 613 - 616
  • [33] Fine-grained Program Partitioning for Security
    Huang, Zhen
    Jaeger, Trent
    Tan, Gang
    PROCEEDINGS OF THE 14TH EUROPEAN WORKSHOP ON SYSTEMS SECURITY (EUROSEC 2021), 2021, : 21 - 26
  • [34] Partitioning Techniques for Fine-grained Indexing
    Wu, Eugene
    Madden, Samuel
    IEEE 27TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2011), 2011, : 1127 - 1138
  • [35] DNN Real-Time Collaborative Inference Acceleration with Mobile Edge Computing
    Yang, Run
    Li, Yan
    He, Hui
    Zhang, Weizhe
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [37] Fine-Grained Urban Flow Inference
    Ouyang, Kun
    Liang, Yuxuan
    Liu, Ye
    Tong, Zekun
    Ruan, Sijie
    Zheng, Yu
    Rosenblum, David S.
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (06) : 2755 - 2770
  • [38] DNN Surgery: Accelerating DNN Inference on the Edge Through Layer Partitioning
    Liang, Huanghuang
    Sang, Qianlong
    Hu, Chuang
    Cheng, Dazhao
    Zhou, Xiaobo
    Wang, Dan
    Bao, Wei
    Wang, Yu
    IEEE TRANSACTIONS ON CLOUD COMPUTING, 2023, 11 (03) : 3111 - 3125
  • [39] Joint DNN partitioning and resource allocation for completion rate maximization of delay-aware DNN inference tasks in wireless powered mobile edge computing
    Tian, Xianzhong
    Xu, Pengcheng
    Shen, Yifan
    Shao, Yuheng
    PEER-TO-PEER NETWORKING AND APPLICATIONS, 2023, 16 (06) : 2865 - 2878
  • [40] Fine-grained data access control for distributed sensor networks
    Hur, Junbeom
    WIRELESS NETWORKS, 2011, 17 : 1235 - 1249