Locality-Aware Scheduling of Independent Tasks for Runtime Systems

Cited by: 2
Authors
Gonthier, Maxime [1, 2]
Marchal, Loris [1, 2]
Thibault, Samuel [3]
Affiliations
[1] ENS Lyon, LIP, CNRS, INRIA, Lyon, France
[2] Univ Claude Bernard Lyon 1, Lyon, France
[3] Univ Bordeaux, CNRS, LaBRI, Inria Bordeaux Sud Ouest, Talence, France
Keywords
Memory-aware scheduling; Eviction policy; Tasks sharing data; Runtime systems
DOI
10.1007/978-3-031-06156-1_1
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
A now-classical way of meeting the increasing demand for computing speed by HPC applications is the use of GPUs and/or other accelerators. Such accelerators have their own memory, which is usually quite limited, and are connected to the main memory through a bus with bounded bandwidth. Thus, particular care should be devoted to data locality in order to avoid unnecessary data movements. Task-based runtime schedulers have emerged as a convenient and efficient way to use such heterogeneous platforms. When processing an application, the scheduler has the knowledge of all tasks available for processing on a GPU, as well as their input data dependencies. Hence, it is able to order tasks and prefetch their input data in the GPU memory (after possibly evicting some previously-loaded data), while aiming at minimizing data movements, so as to reduce the total processing time. In this paper, we focus on how to schedule tasks that share some of their input data (but are otherwise independent) on a GPU. We provide a formal model of the problem, exhibit an optimal eviction strategy, and show that ordering tasks to minimize data movement is NP-complete. We review and adapt existing ordering strategies to this problem, and propose a new one based on task aggregation. These strategies have been implemented in the StarPU runtime system. We present their performance on tasks from tiled 2D and 3D matrix products. Our experiments demonstrate that using our new strategy together with the optimal eviction policy reduces the amount of data movement as well as the total processing time.
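As an illustration of the problem described in the abstract, the following minimal sketch counts how many data loads a given task order causes on a memory of bounded capacity. It is not the paper's implementation and does not use StarPU. The eviction rule used here, evicting the resident item whose next use is furthest in the future (Belady's rule), is an assumption, since the abstract does not say which policy is the optimal one. The names `count_loads`, `next_use`, the task labels `T{i}{j}` and the data labels `A{i}`, `B{j}` are all hypothetical.

```python
from typing import Dict, List, Sequence, Set


def next_use(datum: str, pos: int, order: Sequence[str],
             inputs: Dict[str, Set[str]]) -> float:
    """Index of the first task after position `pos` that reads `datum`."""
    for later in range(pos + 1, len(order)):
        if datum in inputs[order[later]]:
            return later
    return float("inf")  # never used again


def count_loads(order: Sequence[str], inputs: Dict[str, Set[str]],
                capacity: int) -> int:
    """Count loads into a memory holding at most `capacity` data items.

    Assumed eviction rule: drop the resident item that is not needed by
    the current task and whose next use is furthest away.
    """
    memory: Set[str] = set()
    loads = 0
    for pos, task in enumerate(order):
        needed = inputs[task]
        if len(needed) > capacity:
            raise ValueError(f"task {task} does not fit in memory")
        for datum in sorted(needed):
            if datum in memory:
                continue  # already resident, no transfer needed
            if len(memory) >= capacity:
                candidates = sorted(memory - needed)
                victim = max(candidates,
                             key=lambda d: next_use(d, pos, order, inputs))
                memory.remove(victim)
            memory.add(datum)
            loads += 1
    return loads


# Hypothetical 3x3 tiled 2D product: task T{i}{j} reads row block A{i}
# and column block B{j}; tasks are otherwise independent.
n = 3
inputs: Dict[str, Set[str]] = {
    f"T{i}{j}": {f"A{i}", f"B{j}"} for i in range(n) for j in range(n)
}
row_major: List[str] = [f"T{i}{j}" for i in range(n) for j in range(n)]
# Snake order: reverse the column direction on every other row.
snake: List[str] = [
    f"T{i}{j}"
    for i in range(n)
    for j in (range(n) if i % 2 == 0 else reversed(range(n)))
]

print("row-major loads:", count_loads(row_major, inputs, capacity=3))
print("snake loads:", count_loads(snake, inputs, capacity=3))
```

With a capacity of three data items, the two orders give different load counts even under the same eviction rule, which is the effect of task ordering on data movement that the abstract refers to.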
Pages: 5-16
Page count: 12