An efficient OpenMP runtime system for hierarchical architectures

Cited by: 0

Authors:

Thibault, Samuel [1]

Broquedis, Francois [1]

Goglin, Brice [1]

Namyst, Raymond [1]

Wacrenier, Pierre-Andre [1]

Affiliation:

[1] LaBRI, INRIA Futurs, F-33405 Talence, France

Source:

PRACTICAL PROGRAMMING MODEL FOR THE MULTI-CORE ERA, PROCEEDINGS | 2008 / Volume 4935

Keywords:

OpenMP; nested parallelism; hierarchical thread scheduling; bubbles; multi-core; NUMA; SMP;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Exploiting the full computational power of always deeper hierarchical multiprocessor machines requires a very careful distribution of threads and data among the underlying non-uniform architecture. The emergence of multi-core chips and NUMA machines makes it important to minimize the number of remote memory accesses, to favor cache affinities, and to guarantee fast completion of synchronization steps. By using the BubbleSched platform as a threading backend for the GOMP OpenMP compiler, we are able to easily transpose affinities of thread teams into scheduling hints using abstractions called bubbles. We then propose a scheduling strategy suited to nested OpenMP parallelism. The resulting preliminary performance evaluations show an important improvement of the speedup on a typical NAS OpenMP benchmark application.

Pages: 161-172

Page count: 12

