Multi-Stream Scheduling of Inference Pipelines on Edge Devices - a DRL Approach

Authors
Pereira, Danny [1]
Ghosh, Sumana [2]
Dey, Soumyajit [1]
Affiliations
[1] Indian Inst Technol Kharagpur, Comp Sci & Engn, Kharagpur, West Bengal, India
[2] Indian Stat Inst, Comp & Commun Sci Div, Kolkata, West Bengal, India
Keywords
Convolutional neural network; edge device; GPU; deep reinforcement learning; real-time scheduling
DOI
10.1145/3677378
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Low-power edge devices equipped with Graphics Processing Units (GPUs) are a popular target platform for real-time scheduling of inference pipelines. Such application-architecture combinations are popular in Advanced Driver-assistance Systems for aiding in the real-time decision-making of automotive controllers. However, the real-time throughput sustainable by such inference pipelines is limited by resource constraints of the target edge devices. Modern GPUs, both in edge devices and workstation variants, support the facility of concurrent execution of computation kernels and data transfers using the primitive of streams, also allowing for the assignment of priority to these streams. This opens up the possibility of executing computation layers of inference pipelines within a multi-priority, multi-stream environment on the GPU. However, manually co-scheduling such applications while satisfying their throughput requirement and platform memory budget may require an unmanageable number of profiling runs. In this work, we propose a Deep Reinforcement Learning (DRL)-based method for deciding the start time of various operations in each pipeline layer while optimizing the latency of execution of inference pipelines as well as memory consumption. Experimental results demonstrate the promising efficacy of the proposed DRL approach in comparison with the baseline methods, particularly in terms of real-time performance enhancements, schedulability ratio, and memory savings. We have additionally assessed the effectiveness of the proposed DRL approach using a real-time traffic simulation tool IPG CarMaker.
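The trade-off the abstract describes, choosing operation start times so that both latency and memory stay within limits, can be illustrated with a small sketch. This is not the authors' DRL method: the `Op` record, the helper functions, and every number below are hypothetical, and the model simply assumes that each operation holds a fixed amount of device memory for exactly as long as it runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One operation (kernel or data transfer) with a chosen start time."""
    name: str
    start: float     # chosen start time in ms (the scheduling decision)
    duration: float  # assumed profiled run time in ms
    memory: int      # assumed device memory held while running, in MB


def makespan(ops):
    """End-to-end latency: the finish time of the last operation."""
    return max(op.start + op.duration for op in ops)


def peak_memory(ops):
    """Peak concurrent memory, found by sweeping start and end events."""
    events = []
    for op in ops:
        events.append((op.start, 1, op.memory))                  # acquire
        events.append((op.start + op.duration, 0, -op.memory))   # release
    # At equal timestamps, releases (kind 0) sort before acquisitions (kind 1).
    events.sort()
    current = peak = 0
    for _, _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def is_feasible(ops, deadline, budget):
    """A schedule is feasible if it meets the deadline and the memory budget."""
    return makespan(ops) <= deadline and peak_memory(ops) <= budget


# Three hypothetical schedules for two pipelines (a and b), each one copy + one conv.
serial = [Op("copy_a", 0, 2, 100), Op("conv_a", 2, 4, 300),
          Op("copy_b", 6, 2, 100), Op("conv_b", 8, 4, 300)]
overlapped = [Op("copy_a", 0, 2, 100), Op("conv_a", 2, 4, 300),
              Op("copy_b", 2, 2, 100), Op("conv_b", 4, 4, 300)]
staggered = [Op("copy_a", 0, 2, 100), Op("conv_a", 2, 4, 300),
             Op("copy_b", 2, 2, 100), Op("conv_b", 6, 4, 300)]

for label, ops in [("serial", serial), ("overlapped", overlapped),
                   ("staggered", staggered)]:
    print(label, makespan(ops), peak_memory(ops),
          is_feasible(ops, deadline=10, budget=512))
```

Under these assumed numbers, running everything back to back misses the deadline, overlapping both convolutions exceeds the memory budget, and only the staggered schedule satisfies both constraints. A search over start times, whether by DRL as in the paper or by any other method, is looking for schedules of that third kind.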
Pages: 36
Related papers
50 in total
  • [21] Joint strong edge and multi-stream adaptive fusion network for non-uniform image deblurring
    Li, Zihan
    Cui, Guangmang
    Zhao, Jufeng
    Xiang, Qinlei
    He, Bintao
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2022, 89
  • [22] Joint Charging Scheduling and Computation Offloading in EV-Assisted Edge Computing: A Safe DRL Approach
    Zhang, Yongchao
    Hu, Jia
    Min, Geyong
    Chen, Xin
    Georgalas, Nektarios
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (09) : 8757 - 8772
  • [23] An Adaptive Task Migration Scheduling Approach for Edge-Cloud Collaborative Inference
    Zhang, Boyin
    Li, Yinggang
    Zhang, Shigeng
    Zhang, Yue
    Zhu, Bing
    WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022
  • [24] A new multi-stream approach using acoustic and visual features for robust speech recognition system
    Radha, N.
    Shahina, A.
    Khan, A. Nayeemulla
    Velusami, Jansi Rani Sella
    MATERIALS TODAY-PROCEEDINGS, 2022, 62 : 4916 - 4924
  • [25] Coordinated Load Balancing in Mobile Edge Computing Network: a Multi-Agent DRL Approach
    Ma, Manyou
    Wu, Di
    Xu, Yi Tian
    Li, Jimmy
    Jang, Seowoo
    Liu, Xue
    Dudek, Gregory
    IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC 2022), 2022, : 619 - 624
  • [26] Improving QoE of Deep Neural Network Inference on Edge Devices: A Bandit Approach
    Lu, Bingqian
    Yang, Jianyi
    Xu, Jie
    Ren, Shaolei
    IEEE INTERNET OF THINGS JOURNAL, 2022, 9 (21) : 21409 - 21420
  • [27] POS: An Operator Scheduling Framework for Multi-model Inference on Edge Intelligent Computing
    Zhang, Ziyang
    Li, Huan
    Zhao, Yang
    Lin, Changyao
    Liu, Jie
    PROCEEDINGS OF THE 2023 THE 22ND INTERNATIONAL CONFERENCE ON INFORMATION PROCESSING IN SENSOR NETWORKS, IPSN 2023, 2023, : 40 - 52
  • [28] MMG-net: Multi modal approach to estimate blood glucose using multi-stream and cross modality attention
    Chowdhury, Moajjem Hossain
    Chowdhury, Muhammad E. H.
    Alqahtani, Abdulrahman
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 92
  • [29] Multi-stream Adaptive Offloading of Joint Compressed Video Streams, Feature Streams, and Semantic Streams in Edge Computing Systems
    Hu, Dieli
    Ji, Wen
    Wang, Zhi
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 996 - 1001
  • [30] Hastening Stream Offloading of Inference via Multi-Exit DNNs in Mobile Edge Computing
    Liu, Zhicheng
    Song, Jinduo
    Qiu, Chao
    Wang, Xiaofei
    Chen, Xu
    He, Qiang
    Sheng, Hao
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (01) : 535 - 548