Distributed DNN Inference With Fine-Grained Model Partitioning in Mobile Edge Computing Networks

Cited by: 5
Authors
Li, Hui [1 ]
Li, Xiuhua [1 ]
Fan, Qilin [1 ]
He, Qiang [2 ]
Wang, Xiaofei [3 ]
Leung, Victor C. M. [4 ,5 ]
Affiliations
[1] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 400000, Peoples R China
[2] Huazhong Univ Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Sch Comp Sci & Technol, Serv Comp Technol & Syst Lab,Cluster & Grid Comp L, Wuhan 430074, Hubei, Peoples R China
[3] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300072, Peoples R China
[4] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China
[5] Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC V6T1Z4, Canada
Keywords
Delays; Multitasking; Internet of Things; Artificial neural networks; Task analysis; Computational modeling; Inference algorithms; Asynchronous advantage actor-critic; distributed DNN inference; mobile edge computing; model partitioning; multi-task learning; RESOURCE-ALLOCATION; AWARE; ACCELERATION
DOI
10.1109/TMC.2024.3357874
CLC Number
TP [Automation Technology, Computer Technology]
Subject Classification Number
0812
Abstract
Model partitioning is a promising technique for improving the efficiency of distributed inference by executing parts of deep neural network (DNN) models on edge servers (ESs) or Internet-of-Things (IoT) devices. However, due to the heterogeneous resources of ESs and IoT devices in mobile edge computing (MEC) networks, it is non-trivial to guarantee that DNN inference speed satisfies specific delay constraints. Meanwhile, many existing DNN models have deep and complex architectures with numerous DNN blocks, which leads to a huge search space for fine-grained model partitioning. To address these challenges, we investigate distributed DNN inference with fine-grained model partitioning, with collaboration between ESs and IoT devices. We formulate the problem and propose a multi-task learning based asynchronous advantage actor-critic approach to find a competitive model partitioning policy that reduces DNN inference delay. Specifically, we combine the shared layers of the actor-network and critic-network via soft parameter sharing, and expand the output layer into multiple branches to determine the model partitioning policy for each DNN block individually. Experimental results demonstrate that the proposed approach outperforms state-of-the-art approaches, reducing total inference delay, edge inference delay, and local inference delay by an average of 4.76%, 10.04%, and 8.03%, respectively, in the considered MEC networks.
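The abstract's core architectural idea, a shared trunk whose output layer is expanded into multiple branches so that each DNN block gets its own partition decision, can be sketched in minimal form as follows. This is an illustrative numpy sketch only: all names, dimensions, and the single-trunk approximation of soft parameter sharing are assumptions for exposition, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class MultiBranchActorCritic:
    """Hypothetical multi-branch actor-critic: one policy head per DNN block."""

    def __init__(self, state_dim, num_blocks, num_partition_choices, hidden=32):
        # Shared layers (stand-in for soft parameter sharing between actor and critic)
        self.W_shared = rng.normal(scale=0.1, size=(state_dim, hidden))
        # One actor branch per DNN block; each outputs a distribution over
        # candidate partition points for that block
        self.W_branches = [
            rng.normal(scale=0.1, size=(hidden, num_partition_choices))
            for _ in range(num_blocks)
        ]
        # Critic head producing a scalar value estimate
        self.w_critic = rng.normal(scale=0.1, size=(hidden,))

    def forward(self, state):
        h = relu(state @ self.W_shared)                        # shared representation
        policies = [softmax(h @ W) for W in self.W_branches]   # per-block policies
        value = float(h @ self.w_critic)                       # critic estimate
        return policies, value

# Example state: e.g. device/ES loads and link bandwidths (illustrative)
net = MultiBranchActorCritic(state_dim=8, num_blocks=4, num_partition_choices=3)
state = rng.normal(size=8)
policies, value = net.forward(state)
# Greedy decode: one partition decision per DNN block
partition_policy = [int(np.argmax(p)) for p in policies]
print(partition_policy, round(value, 3))
```

The point of the branching is that one forward pass yields a joint decision over all blocks at once, instead of searching the exponential space of per-block partition combinations.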
Pages: 9060 - 9074
Page count: 15