ForestGOMP: An Efficient OpenMP Environment for NUMA Architectures

Cited by: 44
Authors
Broquedis, Francois [1]
Furmento, Nathalie [1]
Goglin, Brice [1]
Wacrenier, Pierre-Andre [1]
Namyst, Raymond [1]
Affiliations
[1] Univ Bordeaux, LaBRI, INRIA Bordeaux Sud Ouest, F-33405 Talence, France
Keywords
OpenMP; Memory; NUMA; Hierarchical Thread Scheduling; Multi-Core; Performance
DOI
10.1007/s10766-010-0136-3
Chinese Library Classification number
TP301 [Theory, Methods];
Subject classification code
081202;
Abstract
Exploiting the full computational power of current hierarchical multiprocessor machines requires a very careful distribution of threads and data across the underlying non-uniform architecture so as to avoid remote memory access penalties. Directive-based programming languages such as OpenMP can greatly help to perform such a distribution by providing programmers with an easy way to structure the parallelism of their application and to transmit this information to the runtime system. Our runtime, which is based on a multi-level thread scheduler combined with a NUMA-aware memory manager, converts this information into scheduling hints related to thread-memory affinity issues. These hints enable dynamic load distribution guided by application structure and hardware topology, thus helping to achieve performance portability. Several experiments show that mixed solutions (migrating both threads and data) outperform work-stealing-based balancing strategies and next-touch-based data distribution policies. These techniques provide insights into additional optimizations.
Pages: 418-439
Page count: 22
Related papers
50 in total
  • [21] Task-Parallel Programming on NUMA Architectures
    Terboven, Christian
    Schmidl, Dirk
    Cramer, Tim
    Mey, Dieter An
    EURO-PAR 2012 PARALLEL PROCESSING, 2012, 7484 : 638 - 649
  • [22] Porting, monitoring and tuning UPC on NUMA architectures
    Mohamed, AS
    PDPTA'03: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED PROCESSING TECHNIQUES AND APPLICATIONS, VOLS 1-4, 2003, : 1518 - 1525
  • [23] Chaosity: Understanding Contemporary NUMA-Architectures
    Nicholson, Hamish
    Nica, Andreea
    Raza, Aunn
    Sanca, Viktor
    Ailamaki, Anastasia
    PERFORMANCE EVALUATION AND BENCHMARKING, TPCTC 2023, 2024, 14247 : 59 - 76
  • [24] A hybrid tool for the performance evaluation of NUMA architectures
    Westall, J
    Geist, R
    PROCEEDINGS OF THE 1997 WINTER SIMULATION CONFERENCE, 1997, : 1029 - 1036
  • [25] sOMP: Simulating OpenMP Task-Based Applications with NUMA Effects
    Daoudi, Idriss
    Virouleau, Philippe
    Gautier, Thierry
    Thibault, Samuel
    Aumage, Olivier
    OPENMP: PORTABLE MULTI-LEVEL PARALLELISM ON MODERN SYSTEMS, 2020, 12295 : 197 - 211
  • [26] An Adaptive Concurrent Priority Queue for NUMA Architectures
    Strati, Foteini
    Giannoula, Christina
    Siakavaras, Dimitrios
    Goumas, Georgios
    Koziris, Nectarios
    CF '19 - PROCEEDINGS OF THE 16TH ACM INTERNATIONAL CONFERENCE ON COMPUTING FRONTIERS, 2019, : 135 - 144
  • [27] Balancing Shared and Distributed Heaps on NUMA Architectures
    Aljabri, Malak
    Loidl, Hans-Wolfgang
    Trinder, Phil
    TRENDS IN FUNCTIONAL PROGRAMMING, TFP 2014, 2015, 8843 : 1 - 17
  • [28] Nap: Persistent Memory Indexes for NUMA Architectures
    Wang, Qing
    Lu, Youyou
    Li, Junru
    Xie, Minhui
    Shu, Jiwu
    ACM TRANSACTIONS ON STORAGE, 2022, 18 (01)
  • [29] Evaluation of an OpenMP Parallelization of Lucas-Kanade on a NUMA-Manycore
    Haggui, Olfa
    Tadonki, Claude
    Sayadi, Fatma
    Ouni, Bouraoui
    2018 30TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD 2018), 2018, : 436 - 441
  • [30] Visualization of Memory Access Behavior on Hierarchical NUMA Architectures
    Weyers, Benjamin
    Terboven, Christian
    Schmidl, Dirk
    Herber, Joachim
    Kuhlen, Torsten W.
    Müller, Matthias S.
    Hentschel, Bernd
    2014 FIRST WORKSHOP ON VISUAL PERFORMANCE ANALYSIS (VPA), 2014, : 42 - 49