Distributed DNN Inference With Fine-Grained Model Partitioning in Mobile Edge Computing Networks

Cited by: 5
Authors
Li, Hui [1 ]
Li, Xiuhua [1 ]
Fan, Qilin [1 ]
He, Qiang [2 ]
Wang, Xiaofei [3 ]
Leung, Victor C. M. [4 ,5 ]
Affiliations
[1] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 400000, Peoples R China
[2] Huazhong Univ Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Sch Comp Sci & Technol, Serv Comp Technol & Syst Lab,Cluster & Grid Comp L, Wuhan 430074, Hubei, Peoples R China
[3] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300072, Peoples R China
[4] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China
[5] Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC V6T1Z4, Canada
Keywords
Delays; Multitasking; Internet of Things; Artificial neural networks; Task analysis; Computational modeling; Inference algorithms; Asynchronous advantage actor-critic; distributed DNN inference; mobile edge computing; model partitioning; multi-task learning; RESOURCE-ALLOCATION; AWARE; ACCELERATION
DOI
10.1109/TMC.2024.3357874
CLC Number
TP [Automation Technology, Computer Technology]
Subject Classification Number
0812
Abstract
Model partitioning is a promising technique for improving the efficiency of distributed inference by executing parts of deep neural network (DNN) models on edge servers (ESs) or Internet-of-Things (IoT) devices. However, due to the heterogeneous resources of ESs and IoT devices in mobile edge computing (MEC) networks, it is non-trivial to guarantee that DNN inference speed satisfies specific delay constraints. Meanwhile, many existing DNN models have deep and complex architectures with numerous DNN blocks, which leads to a huge search space for fine-grained model partitioning. To address these challenges, we investigate distributed DNN inference with fine-grained model partitioning, with collaboration between ESs and IoT devices. We formulate the problem and propose a multi-task learning based asynchronous advantage actor-critic approach to find a competitive model partitioning policy that reduces DNN inference delay. Specifically, we combine the shared layers of the actor-network and critic-network via soft parameter sharing, and expand the output layer into multiple branches to determine the model partitioning policy for each DNN block individually. Experimental results demonstrate that the proposed approach outperforms state-of-the-art approaches, reducing total inference delay, edge inference delay, and local inference delay by an average of 4.76%, 10.04%, and 8.03%, respectively, in the considered MEC networks.
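The abstract's core architectural idea, a shared trunk whose output layer is expanded into multiple branches so that each DNN block gets its own partition decision, can be sketched in minimal form as follows. This is an illustrative numpy sketch only: all names, dimensions, and the single-trunk approximation of soft parameter sharing are assumptions for exposition, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class MultiBranchActorCritic:
    """Hypothetical multi-branch actor-critic: one policy head per DNN block."""

    def __init__(self, state_dim, num_blocks, num_partition_choices, hidden=32):
        # Shared layers (stand-in for soft parameter sharing between actor and critic)
        self.W_shared = rng.normal(scale=0.1, size=(state_dim, hidden))
        # One actor branch per DNN block; each outputs a distribution over
        # candidate partition points for that block
        self.W_branches = [
            rng.normal(scale=0.1, size=(hidden, num_partition_choices))
            for _ in range(num_blocks)
        ]
        # Critic head producing a scalar value estimate
        self.w_critic = rng.normal(scale=0.1, size=(hidden,))

    def forward(self, state):
        h = relu(state @ self.W_shared)                        # shared representation
        policies = [softmax(h @ W) for W in self.W_branches]   # per-block policies
        value = float(h @ self.w_critic)                       # critic estimate
        return policies, value

# Example state: e.g. device/ES loads and link bandwidths (illustrative)
net = MultiBranchActorCritic(state_dim=8, num_blocks=4, num_partition_choices=3)
state = rng.normal(size=8)
policies, value = net.forward(state)
# Greedy decode: one partition decision per DNN block
partition_policy = [int(np.argmax(p)) for p in policies]
print(partition_policy, round(value, 3))
```

The point of the branching is that one forward pass yields a joint decision over all blocks at once, instead of searching the exponential space of per-block partition combinations.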
Pages: 9060 - 9074
Page count: 15