An efficient OpenMP runtime system for hierarchical architectures

Cited by: 0
Authors
Thibault, Samuel [1]
Broquedis, Francois [1]
Goglin, Brice [1]
Namyst, Raymond [1]
Wacrenier, Pierre-Andre [1]
Affiliation
[1] LaBRI, INRIA Futurs, F-33405 Talence, France
Keywords
OpenMP; nested parallelism; hierarchical thread scheduling; bubbles; multi-core; NUMA; SMP;
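DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract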
Exploiting the full computational power of ever deeper hierarchical multiprocessor machines requires a very careful distribution of threads and data among the underlying non-uniform architecture. The emergence of multi-core chips and NUMA machines makes it important to minimize the number of remote memory accesses, to favor cache affinities, and to guarantee fast completion of synchronization steps. By using the BubbleSched platform as a threading backend for the GOMP OpenMP compiler, we are able to easily transpose affinities of thread teams into scheduling hints using abstractions called bubbles. We then propose a scheduling strategy suited to nested OpenMP parallelism. The resulting preliminary performance evaluations show a significant improvement in speedup on a typical NAS OpenMP benchmark application.
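The idea of expressing thread-team affinities as bubbles can be illustrated with a toy model. The sketch below is not the BubbleSched API or the authors' scheduling strategy: the `Thread` and `Bubble` classes, the `build_nested_teams` and `place` functions, and the round-robin policy are all hypothetical names and choices made for illustration. It assumes a two-level nested parallel region and a machine described only by a number of NUMA nodes, and shows how keeping each inner team in its own bubble lets a scheduler place team-mates on the same node.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Thread:
    """A single thread, identified by a name (hypothetical model)."""
    name: str


@dataclass
class Bubble:
    """A group of threads or nested bubbles that should be scheduled close together."""
    name: str
    children: List[Union["Bubble", Thread]] = field(default_factory=list)


def build_nested_teams(outer: int, inner: int) -> Bubble:
    """Model a nested parallel region: each of `outer` members forks a team of `inner` threads."""
    root = Bubble("outer-team")
    for i in range(outer):
        team = Bubble(f"inner-team-{i}")
        team.children = [Thread(f"t{i}.{j}") for j in range(inner)]
        root.children.append(team)
    return root


def place(root: Bubble, numa_nodes: int) -> Dict[int, List[str]]:
    """Toy policy (an assumption, not the paper's strategy): open the outer bubble
    and give each inner bubble to one NUMA node, round-robin, so that the threads
    of one team always share a node."""
    placement: Dict[int, List[str]] = {n: [] for n in range(numa_nodes)}
    for i, child in enumerate(root.children):
        node = i % numa_nodes
        members = child.children if isinstance(child, Bubble) else [child]
        placement[node].extend(m.name for m in members)
    return placement


teams = build_nested_teams(outer=4, inner=2)
placement = place(teams, numa_nodes=2)
for node, names in placement.items():
    print(node, names)
```

In this model the inner teams are never split across nodes, which is the property the abstract associates with fewer remote memory accesses and better cache affinity; a real scheduler would also have to balance load and handle deeper hierarchies.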
Pages: 161-172
Number of pages: 12
Related papers
50 in total
  • [21] Runtime Aware Architectures
    Valero Cortes, Mateo
    SIGSIM-PADS'18: PROCEEDINGS OF THE 2018 ACM SIGSIM CONFERENCE ON PRINCIPLES OF ADVANCED DISCRETE SIMULATION, 2018, : 3 - 4
  • [22] OpenMP Extensions for Heterogeneous Architectures
    White, Leo
    OPENMP IN THE PETASCALE ERA, (IWOMP 2011), 2011, 6665 : 94 - 107
  • [23] A transparent runtime data distribution engine for OpenMP
    Nikolopoulos, D.S.
    Papatheodorou, T.S.
    Polychronopoulos, C.D.
    Labarta, J.
    Ayguade, E.
    Scientific Programming, 2000, 8 (03) : 143 - 162
  • [24] Runtime Aware Architectures
    Valero, Mateo
    2017 31ST IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2017, : 819 - 819
  • [25] Runtime Aware Architectures
    Valero, Mateo
    9TH WORKSHOP ON GENERAL PURPOSE PROCESSING USING GPUS (GPGPU 9), 2016, : 1 - 1
  • [26] OpenMP Task Scheduling Analysis via OpenMP Runtime API and Tool Visualization
    Qawasmeh, Ahmad
    Malik, Abid
    Chapman, Barbara
    PROCEEDINGS OF 2014 IEEE INTERNATIONAL PARALLEL & DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2014, : 1050 - 1059
  • [27] Using Runtime Systems Tools to Implement Efficient Preconditioners for Heterogeneous Architectures
    Roussel, Adrien
    Gratien, Jean-Marc
    Gautier, Thierry
    OIL & GAS SCIENCE AND TECHNOLOGY-REVUE D IFP ENERGIES NOUVELLES, 2016, 71 (06):
  • [28] Runtime Determinacy Race Detection for OpenMP Tasks
    Matar, Hassan Salehe
    Unat, Didem
    EURO-PAR 2018: PARALLEL PROCESSING, 2018, 11014 : 31 - 45
  • [29] Cilk: An efficient multithreaded runtime system
    Blumofe, RD
    Joerg, CF
    Kuszmaul, BC
    Leiserson, CE
    Randall, KH
    Zhou, YL
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1996, 37 (01) : 55 - 69
  • [30] Vectorized Barrier and Reduction in LLVM OpenMP Runtime
    Farooqi, Muhammad Nufail
    Pericas, Miquel
    OPENMP: ENABLING MASSIVE NODE-LEVEL PARALLELISM, IWOMP 2021, 2021, 12870 : 18 - 32