Exploring Plan-Based Scheduling for Large-Scale Computing Systems

Cited by: 9
Authors
Zheng, Xingwu [1 ]
Zhou, Zhou [2 ]
Yang, Xu [2 ]
Lan, Zhiling [2 ]
Wang, Jia [1 ]
Affiliations
[1] IIT, Dept Elect & Comp Engn, Chicago, IL 60616 USA
[2] IIT, Dept Comp Sci, Chicago, IL 60616 USA
Keywords
Plan-based scheduling; Simulated annealing algorithm; Optimization
DOI
10.1109/CLUSTER.2016.43
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
As HPC systems scale toward exascale, managing the underlying resources more effectively becomes critical. Almost all existing resource management systems schedule jobs in a queuing fashion, making isolated scheduling decisions that can compromise system performance even with backfilling. Plan-based schedulers can potentially generate better job schedules by producing an execution plan for all waiting jobs, but they have not received enough attention. In this paper, we present a novel plan-based scheduling system that uses simulated annealing as its optimization engine to support effective resource management on HPC systems. Extensive trace-based simulations with workload traces collected from a wide range of production supercomputers show that, compared with a queue-based scheduling system using FCFS with EASY backfilling, our plan-based scheduling system can reduce job wait time by 40% and job response time by 30% while slightly improving system utilization. Moreover, our plan-based system can run online, solving the scheduling problem at each scheduling iteration within one second, which makes it practical for production HPC systems.
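The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of plan-based scheduling with simulated annealing, not the authors' system. It assumes a plan is an ordering of the waiting jobs, that all jobs are already queued at time 0, that each job requests no more nodes than the machine has, and that the objective is total wait time. The names `total_wait` and `anneal`, the swap move, and the cooling parameters are all hypothetical choices made for illustration.

```python
import heapq
import math
import random
from typing import List, Tuple

Job = Tuple[int, int]  # (nodes requested, estimated runtime); hypothetical job model


def total_wait(order: List[int], jobs: List[Job], total_nodes: int) -> int:
    """Start jobs strictly in the given order and return the summed wait time.

    Assumes every job is queued at time 0 and fits on the machine.
    """
    running: list = []  # min-heap of (end_time, nodes held)
    free, now, wait = total_nodes, 0, 0
    for idx in order:
        nodes, runtime = jobs[idx]
        # Advance time until enough nodes are released for this job.
        while free < nodes:
            end, released = heapq.heappop(running)
            now = max(now, end)
            free += released
        wait += now
        free -= nodes
        heapq.heappush(running, (now + runtime, nodes))
    return wait


def anneal(jobs: List[Job], total_nodes: int, steps: int = 2000,
           t0: float = 10.0, cooling: float = 0.995, seed: int = 0):
    """Search over job orderings with simulated annealing (illustrative only)."""
    rng = random.Random(seed)
    order = list(range(len(jobs)))  # start from submission (FCFS) order
    cost = total_wait(order, jobs, total_nodes)
    best, best_cost = order[:], cost
    temp = t0
    if len(order) < 2:
        return best, best_cost
    for _ in range(steps):
        # Neighbor move: swap two positions in the plan.
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        new_cost = total_wait(order, jobs, total_nodes)
        delta = new_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept worse plans.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = order[:], cost
        else:
            order[i], order[j] = order[j], order[i]  # undo the swap
        temp *= cooling  # geometric cooling schedule (an assumption)
    return best, best_cost
```

For example, with `jobs = [(4, 10), (1, 1), (1, 1)]` on a 4-node machine, the FCFS order makes both small jobs wait behind the large one, whereas a plan that runs the small jobs first lets the large job start after one time unit. A production scheduler would also need to handle job arrivals, runtime estimates that turn out wrong, and fairness, none of which this sketch covers.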
Pages: 259-268
Number of pages: 10