Multi-Stream Scheduling of Inference Pipelines on Edge Devices - a DRL Approach

Cited by: 0

Authors:

Pereira, Danny [1]

Ghosh, Sumana [2]

Dey, Soumyajit [1]

Affiliations:

[1] Indian Inst Technol Kharagpur, Comp Sci & Engn, Kharagpur, West Bengal, India

[2] Indian Stat Inst, Comp & Commun Sci Div, Kolkata, West Bengal, India

Source:

ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS | 2024 / Vol. 29 / Issue 06

Keywords:

Convolutional neural network; edge device; GPU; deep reinforcement learning; real-time scheduling

DOI:

10.1145/3677378

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Low-power edge devices equipped with Graphics Processing Units (GPUs) are a popular target platform for real-time scheduling of inference pipelines. Such application-architecture combinations are common in Advanced Driver-Assistance Systems, where they aid the real-time decision-making of automotive controllers. However, the real-time throughput sustainable by such inference pipelines is limited by the resource constraints of the target edge devices. Modern GPUs, both in edge devices and workstation variants, support concurrent execution of computation kernels and data transfers through the stream primitive, and also allow priorities to be assigned to these streams. This opens up the possibility of executing the computation layers of inference pipelines within a multi-priority, multi-stream environment on the GPU. However, manually co-scheduling such applications while satisfying their throughput requirements and the platform memory budget may require an unmanageable number of profiling runs. In this work, we propose a Deep Reinforcement Learning (DRL)-based method for deciding the start time of the various operations in each pipeline layer while optimizing both the execution latency of the inference pipelines and their memory consumption. Experimental results demonstrate the efficacy of the proposed DRL approach in comparison with the baseline methods, particularly in terms of real-time performance, schedulability ratio, and memory savings. We have additionally assessed the effectiveness of the proposed DRL approach using the real-time traffic simulation tool IPG CarMaker.
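
The trade-off named in the abstract, between execution latency and memory consumption when choosing operation start times, can be illustrated with a toy model. The sketch below is not the paper's method and contains no DRL; it only evaluates a candidate schedule. The `Op` record, its fields, the function names, and the sample numbers are all hypothetical, and it assumes each operation holds a fixed amount of memory for exactly its execution interval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical record for one scheduled operation (not from the paper)."""
    name: str
    start: float     # start time chosen by some scheduler, in ms
    duration: float  # execution time, in ms
    memory: int      # memory held while the operation runs, in MB


def makespan(ops):
    """Time from the earliest start to the latest finish."""
    return max(op.start + op.duration for op in ops) - min(op.start for op in ops)


def peak_memory(ops):
    """Largest total memory held by overlapping operations (sweep line)."""
    events = []
    for op in ops:
        events.append((op.start, 1, op.memory))                 # allocate
        events.append((op.start + op.duration, 0, -op.memory))  # release
    # Sort by time; at equal times, releases (0) are applied before allocations (1).
    events.sort()
    current = peak = 0
    for _, _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def is_feasible(ops, deadline, budget):
    """True if the schedule meets both the deadline and the memory budget."""
    return makespan(ops) <= deadline and peak_memory(ops) <= budget


# Two layers of pipeline A and one layer of pipeline B, partly overlapped.
overlapped = [
    Op("conv1_A", 0.0, 4.0, 100),
    Op("conv1_B", 2.0, 4.0, 150),
    Op("fc_A", 4.0, 2.0, 50),
]
# Same operations, with conv1_B delayed until conv1_A has finished.
delayed = [
    Op("conv1_A", 0.0, 4.0, 100),
    Op("conv1_B", 4.0, 4.0, 150),
    Op("fc_A", 4.0, 2.0, 50),
]
```

In this toy instance the overlapped schedule finishes in 6 ms with a 250 MB peak, while the delayed one takes 8 ms with a 200 MB peak, so which of the two is acceptable depends on the deadline and budget passed to `is_feasible`.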

Pages: 36

50 entries in total

[41] Pathological voice classification system based on CNN-BiLSTM network using speech enhancement and multi-stream approach
Belabbas S.
Addou D.
Selouani S.A.
International Journal of Speech Technology, 2024, 27 (02) : 483 - 502
[42] Ace-Sniper: Cloud-Edge Collaborative Scheduling Framework With DNN Inference Latency Modeling on Heterogeneous Devices
Liu, Weihong
Geng, Jiawei
Zhu, Zongwei
Zhao, Yang
Ji, Cheng
Li, Changlong
Lian, Zirui
Zhou, Xuehai
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2024, 43 (02) : 534 - 547
[43] Reaching for the Sky: Maximizing Deep Learning Inference Throughput on Edge Devices with AI Multi-Tenancy
Hao, Jianwei
Subedi, Piyush
Ramaswamy, Lakshmish
Kim, In Kee
ACM TRANSACTIONS ON INTERNET TECHNOLOGY, 2023, 23 (01)
[44] Inferencing on Edge Devices: A Time- and Space-aware Co-scheduling Approach
Pereira, Danny
Ghose, Anirban
Ghosh, Sumana
Dey, Soumyajit
ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2023, 28 (03)
[45] Graph Tasks Offloading and Resource Allocation in Multi-Access Edge Computing: A DRL-and-Optimization-Aided Approach
Li, Jinming
Gu, Bo
Qin, Zhen
Han, Yu
IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2023, 10 (06) : 3707 - 3718
[46] Multi-task scheduling in vehicular edge computing: a multi-agent reinforcement learning approach
Zhao, Yiming
Mo, Lei
Liu, Ji
CCF TRANSACTIONS ON PERVASIVE COMPUTING AND INTERACTION, 2024, 6 (04) : 348 - 364
[47] MARS: A DRL-Based Multi-Task Resource Scheduling Framework for UAV With IRS-Assisted Mobile Edge Computing System
Jiang, Feibo
Peng, Yubo
Wang, Kezhi
Dong, Li
Yang, Kun
IEEE TRANSACTIONS ON CLOUD COMPUTING, 2023, 11 (04) : 3700 - 3712
[48] A Deep Reinforcement Learning Approach to Multi-component Job Scheduling in Edge Computing
Cao, Zhi
Zhang, Honggang
Cao, Yu
Liu, Benyuan
2019 15TH INTERNATIONAL CONFERENCE ON MOBILE AD-HOC AND SENSOR NETWORKS (MSN 2019), 2019 : 19 - 24
[49] Imbalance Cost-Aware Energy Scheduling for Prosumers Towards UAM Charging: A Matching and Multi-Agent DRL Approach
Zou, Luyao
Munir, Md. Shirajum
Hassan, Sheikh Salman
Tun, Yan Kyaw
Nguyen, Loc X.
Hong, Choong Seon
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2024, 73 (03) : 3404 - 3420
[50] Multi-agent DRL-based data-driven approach for PEVs charging/discharging scheduling in smart grid
Wan, Yanni
Qin, Jiahu
Ma, Qichao
Fu, Weiming
Wang, Shi
JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2022, 359 (04) : 1747 - 1767
