ForestGOMP: An Efficient OpenMP Environment for NUMA Architectures

Cited by: 44

Authors:

Broquedis, Francois [1]

Furmento, Nathalie [1]

Goglin, Brice [1]

Wacrenier, Pierre-Andre [1]

Namyst, Raymond [1]

Affiliations:

[1] Univ Bordeaux, LaBRI, INRIA Bordeaux Sud Ouest, F-33405 Talence, France

Source:

INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING | 2010 / Volume 38 / Issue 5-6

Keywords:

OpenMP; Memory; NUMA; Hierarchical Thread Scheduling; Multi-Core; Performance

DOI:

10.1007/s10766-010-0136-3

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

Exploiting the full computational power of current hierarchical multiprocessor machines requires a very careful distribution of threads and data among the underlying non-uniform architecture so as to avoid remote memory access penalties. Directive-based programming languages such as OpenMP can greatly help to perform such a distribution by providing programmers with an easy way to structure the parallelism of their application and to transmit this information to the runtime system. Our runtime, which is based on a multi-level thread scheduler combined with a NUMA-aware memory manager, converts this information into scheduling hints related to thread-memory affinity issues. These hints enable dynamic load distribution guided by application structure and hardware topology, thus helping to achieve performance portability. Several experiments show that mixed solutions (migrating both threads and data) outperform work-stealing-based balancing strategies and next-touch-based data distribution policies. These techniques provide insights about additional optimizations.

Citation

Pages: 418-439

Number of pages: 22

50 records in total

[21] Task-Parallel Programming on NUMA Architectures
Terboven, Christian
Schmidl, Dirk
Cramer, Tim
an Mey, Dieter
EURO-PAR 2012 PARALLEL PROCESSING, 2012, 7484 : 638 - 649
[22] Porting, monitoring and tuning UPC on NUMA architectures
Mohamed, AS
PDPTA'03: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED PROCESSING TECHNIQUES AND APPLICATIONS, VOLS 1-4, 2003, : 1518 - 1525
[23] Chaosity: Understanding Contemporary NUMA-Architectures
Nicholson, Hamish
Nica, Andreea
Raza, Aunn
Sanca, Viktor
Ailamaki, Anastasia
PERFORMANCE EVALUATION AND BENCHMARKING, TPCTC 2023, 2024, 14247 : 59 - 76
[24] A hybrid tool for the performance evaluation of NUMA architectures
Westall, J
Geist, R
PROCEEDINGS OF THE 1997 WINTER SIMULATION CONFERENCE, 1997, : 1029 - 1036
[25] sOMP: Simulating OpenMP Task-Based Applications with NUMA Effects
Daoudi, Idriss
Virouleau, Philippe
Gautier, Thierry
Thibault, Samuel
Aumage, Olivier
OPENMP: PORTABLE MULTI-LEVEL PARALLELISM ON MODERN SYSTEMS, 2020, 12295 : 197 - 211
[26] An Adaptive Concurrent Priority Queue for NUMA Architectures
Strati, Foteini
Giannoula, Christina
Siakavaras, Dimitrios
Goumas, Georgios
Koziris, Nectarios
CF '19 - PROCEEDINGS OF THE 16TH ACM INTERNATIONAL CONFERENCE ON COMPUTING FRONTIERS, 2019, : 135 - 144
[27] Balancing Shared and Distributed Heaps on NUMA Architectures
Aljabri, Malak
Loidl, Hans-Wolfgang
Trinder, Phil
TRENDS IN FUNCTIONAL PROGRAMMING, TFP 2014, 2015, 8843 : 1 - 17
[28] Nap: Persistent Memory Indexes for NUMA Architectures
Wang, Qing
Lu, Youyou
Li, Junru
Xie, Minhui
Shu, Jiwu
ACM TRANSACTIONS ON STORAGE, 2022, 18 (01)
[29] Evaluation of an OpenMP Parallelization of Lucas-Kanade on a NUMA-Manycore
Haggui, Olfa
Tadonki, Claude
Sayadi, Fatma
Ouni, Bouraoui
2018 30TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD 2018), 2018, : 436 - 441
[30] Visualization of Memory Access Behavior on Hierarchical NUMA Architectures
Weyers, Benjamin
Terboven, Christian
Schmidl, Dirk
Herber, Joachim
Kuhlen, Torsten W.
Müller, Matthias S.
Hentschel, Bernd
2014 FIRST WORKSHOP ON VISUAL PERFORMANCE ANALYSIS (VPA), 2014, : 42 - 49
