SpotDAG: An RL-Based Algorithm for DAG Workflow Scheduling in Heterogeneous Cloud Environments

Cited by: 2
Authors
Lin, Liduo [1]
Pan, Li [1]
Liu, Shijun [1]
Affiliations
[1] Shandong University, School of Software, Jinan, People's Republic of China
Funding
National Key Research and Development Program;
Keywords
Data processing; Job shop scheduling; Costs; Cloud computing; Optimization; Task analysis; Data models; Heterogeneous cloud environments; spot instance; on-demand instance; IaaS; TASKS;
DOI
10.1109/TSC.2024.3422828
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
As increasingly complex functions are implemented in applications, directed acyclic graphs (DAGs) are widely used to model the inter-dependencies between individual functions. Cloud-based data processing platforms need to consider the complex topology of DAGs and arbitrary deadlines given by users for job scheduling, leading to an NP-hard decision-making problem. Leveraging spot instances in data processing platforms can achieve significant cost savings, but the unpredictable interruption of spot instances makes the problem of VM scaling and job scheduling more difficult. In this paper, a Reinforcement Learning (RL) based approach called SpotDAG is proposed to solve the auto-scaling problem for jobs modeled as DAGs on a data processing platform where spot instances are introduced. SpotDAG makes cluster scaling and job scheduling decisions at the same time by mapping its output to several meta-policies. This paper introduces the self-attention mechanism for feature extraction to help the intelligent agent learn faster. A mask layer after the output of the proposed RL-based algorithm circumvents illegal actions to ensure that a job is completed by its deadline. Extensive experimental results show that the proposed approach can significantly reduce the cost of instances for data processing platforms while ensuring that jobs are completed in time.
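The mask layer mentioned in the abstract can be illustrated with a generic action-masking sketch. This is not the authors' implementation: the abstract does not specify how legality is determined, so the rule used here (an action is illegal if its estimated finish time exceeds the deadline) and the names `deadline_mask` and `masked_softmax` are assumptions made only for illustration.

```python
import math


def deadline_mask(now, deadline, est_runtimes):
    """Hypothetical legality rule: an action is legal only if the job
    would still finish by its deadline under that action's estimated runtime."""
    return [now + runtime <= deadline for runtime in est_runtimes]


def masked_softmax(logits, legal):
    """Softmax over policy logits with illegal actions forced to probability 0."""
    if not any(legal):
        raise ValueError("at least one action must be legal")
    # Illegal actions get a logit of -inf, so exp() maps them to exactly 0.
    masked = [x if ok else -math.inf for x, ok in zip(logits, legal)]
    peak = max(masked)  # finite, because at least one action is legal
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Three candidate actions (assumed values); the second would miss the deadline.
legal = deadline_mask(now=0.0, deadline=10.0, est_runtimes=[8.0, 12.0, 5.0])
probs = masked_softmax([1.0, 2.0, 3.0], legal)
```

Because the masked action receives zero probability, an agent sampling from `probs` can never select it, regardless of how high its raw logit was.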
Pages: 2904-2917
Page count: 14