Dtree: Dynamic Task Scheduling at Petascale

Cited by: 2
Authors
Pamnany, Kiran [1]
Misra, Sanchit [1]
Md, Vasimuddin [2]
Liu, Xing [3]
Chow, Edmond [4]
Aluru, Srinivas [4]
Affiliations
[1] Intel Corp, Parallel Comp Lab, Bangalore, Karnataka, India
[2] Indian Inst Technol, Dept Comp Sci & Engn, Bombay, Maharashtra, India
[3] IBM TJ Watson Res Ctr, Yorktown Hts, NY USA
[4] Georgia Inst Technol, Sch Computat Sci & Engn, Atlanta, GA 30332 USA
Source
HIGH PERFORMANCE COMPUTING, ISC HIGH PERFORMANCE 2015 | 2015 / Vol. 9137
Keywords
Petascale; Dynamic scheduling; Load balance
DOI
10.1007/978-3-319-20119-1_10
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Irregular applications are challenging to scale on supercomputers due to the difficulty of balancing load across large numbers of nodes. This challenge is exacerbated by the increasing heterogeneity of modern supercomputers in which nodes often contain multiple processors and coprocessors operating at different speeds, and with differing core and thread counts. We present Dtree, a dynamic task scheduler designed to address this challenge. Dtree shows close to optimal results for a class of HPC applications, improving time-to-solution by achieving near-perfect load balance while consuming negligible resources. We demonstrate Dtree's effectiveness on up to 77,824 heterogeneous cores of the TACC Stampede supercomputer with two different petascale HPC applications: ParaBLe, which performs large-scale Bayesian network structure learning, and GTFock, which implements Fock matrix construction, an essential and expensive step in quantum chemistry codes. For ParaBLe, we show improved performance while eliminating the complexity of managing heterogeneity. For GTFock, we match the most recently published performance without using any application-specific optimizations for data access patterns (such as the task distribution design for communication reduction) that enabled that performance. We also show that Dtree can distribute from tens of thousands to hundreds of millions of irregular tasks across up to 1024 nodes with minimal overhead, while balancing load to within 2% of optimal.
Pages: 122-138
Page count: 17