TASKWORK: A Cloud-aware Runtime System for Elastic Task-parallel HPC Applications

Cited by: 4
Authors
Kehrer, Stefan [1]
Blochinger, Wolfgang [1]
Affiliations
[1] Reutlingen Univ, Parallel & Distributed Comp Grp, Reutlingen, Germany
Keywords
Cloud Computing; High Performance Computing; Task Parallelism; Elasticity of Parallel Computations
DOI
10.5220/0007795501980209
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
With the capability of employing virtually unlimited compute resources, the cloud has evolved into an attractive execution environment for applications from the High Performance Computing (HPC) domain. By means of elastic scaling, compute resources can be provisioned and decommissioned at runtime. This gives rise to a new concept in HPC: elasticity of parallel computations. However, it is still an open research question to what extent HPC applications can benefit from elastic scaling and how to leverage elasticity of parallel computations. In this paper, we discuss how to address these challenges for HPC applications with dynamic task parallelism and present TASKWORK, a cloud-aware runtime system based on our findings. TASKWORK enables the implementation of elastic HPC applications by means of higher-level development frameworks and solves the corresponding coordination problems based on Apache ZooKeeper. For evaluation purposes, we discuss a development framework for parallel branch-and-bound based on TASKWORK, show how to implement an elastic HPC application, and report on measurements with respect to parallel efficiency and elastic scaling.
Pages: 198 - 209
Number of pages: 12