Speculative Scheduling for Stochastic HPC Applications

Cited by: 7
Authors
Gainaru, Ana [1]
Pallez, Guillaume [2]
Sun, Hongyang [1]
Raghavan, Padma [1]
Affiliations
[1] Vanderbilt Univ, 221 Kirkland Hall, Nashville, TN 37235 USA
[2] Univ Bordeaux, INRIA, Talence, France
Keywords
Scheduling algorithm; HPC runtime; stochastic applications; performance; tasks
DOI
10.1145/3337821.3337890
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Emerging fields are producing a growing number of large-scale applications with heterogeneous, dynamic, and data-intensive requirements. These applications put a high emphasis on productivity and are therefore not tuned to run efficiently on today's high performance computing (HPC) systems. Some of them, such as neuroscience workloads and those that use adaptive numerical algorithms, develop modeling and simulation workflows with stochastic execution times and unpredictable resource requirements. Deploying them on current HPC systems with existing resource management solutions can reduce efficiency for users and lower effective system utilization for platform providers. In this paper, we consider the current HPC scheduling model and describe the challenge it poses for stochastic applications because of the strict requirements in its job deployment policies. To address this challenge, we present speculative scheduling techniques that adapt the resource requirements of a stochastic application on the fly, based on its past execution behavior rather than on user-provided estimates. We focus on improving overall system utilization and application response time without disrupting the current HPC scheduling model or the application development process. Our solution can operate alongside existing HPC batch schedulers without interfering with their usage modes. We show that speculative scheduling can improve system utilization and average application response time by 25-30% compared to the classical HPC approach.
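The idea of deriving resource requests from past execution behavior can be illustrated with a small sketch. It is not the authors' algorithm: the function name `expected_cost`, the replay-based evaluation, and the cost model are all assumptions made for illustration. In particular, the sketch charges every attempt its full requested walltime, as a crude stand-in for the longer queue wait that larger requests tend to incur, and it assumes a run that exceeds its request is resubmitted from scratch with the next request in the sequence.

```python
from typing import Sequence


def expected_cost(history: Sequence[float], requests: Sequence[float]) -> float:
    """Average charged time per run when past runtimes are replayed
    against an increasing sequence of walltime requests.

    Assumed cost model (hypothetical, for illustration only): every
    attempt is charged its full request, and a run that exceeds its
    request is restarted with the next request in the sequence.
    """
    if not history or not requests:
        raise ValueError("history and requests must be non-empty")
    if any(a >= b for a, b in zip(requests, requests[1:])):
        raise ValueError("requests must be strictly increasing")
    if max(history) > requests[-1]:
        raise ValueError("last request must cover the longest past run")

    total = 0.0
    for runtime in history:
        for request in requests:
            total += request          # every attempt is charged in full
            if runtime <= request:    # this attempt completes the run
                break
    return total / len(history)


# Hypothetical history: most runs are short, one is long.
past_runs = [1.0, 1.0, 1.0, 1.0, 10.0]
print(expected_cost(past_runs, [10.0]))       # single conservative request
print(expected_cost(past_runs, [2.0, 10.0]))  # short request first, then long
```

Under these assumptions, the single conservative request costs 10.0 per run on average, while trying a short request first costs 4.0, because only the one long run pays for a failed attempt.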
Pages: 10