SpotDAG: An RL-Based Algorithm for DAG Workflow Scheduling in Heterogeneous Cloud Environments

Cited by: 2
Authors
Lin, Liduo [1]
Pan, Li [1]
Liu, Shijun [1]
Affiliations
[1] Shandong University, School of Software, Jinan, People's Republic of China
Funding
National Key Research and Development Program of China;
Keywords
Data processing; Job shop scheduling; Costs; Cloud computing; Optimization; Task analysis; Data models; Heterogeneous cloud environments; spot instance; on-demand instance; IaaS; TASKS;
DOI
10.1109/TSC.2024.3422828
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
As increasingly complex functions are implemented in applications, directed acyclic graphs (DAGs) are widely used to model the inter-dependencies between individual functions. Cloud-based data processing platforms need to consider the complex topology of DAGs and arbitrary deadlines given by users for job scheduling, leading to an NP-hard decision-making problem. Leveraging spot instances in data processing platforms can achieve significant cost savings, but the unpredictable interruption of spot instances makes the problem of VM scaling and job scheduling more difficult. In this paper, a Reinforcement Learning (RL) based approach called SpotDAG is proposed to solve the auto-scaling problem for jobs modeled as DAGs on a data processing platform where spot instances are introduced. SpotDAG makes cluster scaling and job scheduling decisions at the same time by mapping its output to several meta-policies. This paper introduces the self-attention mechanism for feature extraction to help the intelligent agent learn faster. A mask layer after the output of the proposed RL-based algorithm circumvents illegal actions to ensure that a job is completed by its deadline. Extensive experimental results show that the proposed approach can significantly reduce the cost of instances for data processing platforms while ensuring that jobs are completed in time.
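The abstract describes a mask layer placed after the policy output to rule out illegal actions. One common way to realize such a mask, shown here only as a minimal sketch and not as the authors' implementation, is to set the logits of illegal actions to negative infinity before the softmax so that they receive zero probability. The action names, logit values, and legality flags below are hypothetical and do not come from the paper.

```python
import math

def masked_softmax(logits, legal):
    """Softmax over logits, with illegal actions forced to probability 0."""
    if not any(legal):
        raise ValueError("at least one action must be legal")
    # Replace illegal logits with -inf so exp() maps them to exactly 0.
    masked = [x if ok else float("-inf") for x, ok in zip(logits, legal)]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical meta-policies; the names are illustrative assumptions.
actions = ["launch_spot", "launch_on_demand", "wait"]
logits = [2.0, 0.5, 1.0]
# Assume a deadline check has ruled out both spot capacity and waiting.
legal = [False, True, False]
probs = masked_softmax(logits, legal)
```

Because masked actions can never be sampled, the agent is not left to learn deadline feasibility purely from reward penalties; how legality is actually determined in SpotDAG is not specified in the abstract.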
Pages: 2904-2917
Page count: 14