ForestGOMP: An Efficient OpenMP Environment for NUMA Architectures

Cited by: 44
Authors
Broquedis, Francois [1]
Furmento, Nathalie [1]
Goglin, Brice [1]
Wacrenier, Pierre-Andre [1]
Namyst, Raymond [1]
Affiliations
[1] Univ Bordeaux, LaBRI, INRIA Bordeaux Sud Ouest, F-33405 Talence, France
Keywords
OpenMP; Memory; NUMA; Hierarchical Thread Scheduling; Multi-Core; Performance
DOI
10.1007/s10766-010-0136-3
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Exploiting the full computational power of current hierarchical multiprocessor machines requires a very careful distribution of threads and data among the underlying non-uniform architecture so as to avoid remote memory access penalties. Directive-based programming languages such as OpenMP can greatly help perform such a distribution by providing programmers with an easy way to structure the parallelism of their application and to transmit this information to the runtime system. Our runtime, which is based on a multi-level thread scheduler combined with a NUMA-aware memory manager, converts this information into scheduling hints related to thread-memory affinity issues. These hints enable dynamic load distribution guided by application structure and hardware topology, thus helping to achieve performance portability. Several experiments show that mixed solutions (migrating both threads and data) outperform work-stealing based balancing strategies and next-touch-based data distribution policies. These techniques provide insights about additional optimizations.
Pages: 418-439
Page count: 22
Related papers (50 in total)
  • [31] Compiler Support for Selective Page Migration in NUMA Architectures
    Piccoli, Guilherme
    Santos, Henrique N.
    Rodrigues, Raphael E.
    Pousa, Christiane
    Borin, Edson
    Pereira, Fernando Magno Quintao
    PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT'14), 2014, : 369 - 380
  • [32] Optimized Execution Strategies for Sequence Aligners on NUMA Architectures
    Lenis, Josefina
    Senar, Miquel Angel
    EURO-PAR 2016: PARALLEL PROCESSING WORKSHOPS, 2017, 10104 : 492 - 503
  • [33] A Tool to Analyze the Performance of Multithreaded Programs on NUMA Architectures
    Liu, Xu
    Mellor-Crummey, John
    ACM SIGPLAN NOTICES, 2014, 49 (08) : 259 - 271
  • [34] Data access collection and data partitioning for NUMA architectures
    Calidonna, CR
    Furnari, MM
    ADVANCES IN COMPUTATIONAL MECHANICS WITH HIGH PERFORMANCE COMPUTING, 1998, : 33 - 40
  • [35] Parallel simulations of seismic wave propagation on NUMA architectures
    Dupros, Fabrice
    Pousa Ribeiro, Christiane
    Carissimi, Alexandre
    Mehaut, Jean-Francois
    PARALLEL COMPUTING: FROM MULTICORES AND GPU'S TO PETASCALE, 2010, 19 : 67 - 74
  • [36] Parallel programming environment for OpenMP
    Park, Insung
    Voss, Michael J.
    Kim, Seon Wook
    Eigenmann, Rudolf
    Scientific Programming, 2001, 9 (2-3) : 143 - 162
  • [37] An environment for OpenMP code parallelization
    Ierotheou, CS
    Jin, H
    Matthews, G
    Johnson, SP
    Hood, R
    PARALLEL COMPUTING: SOFTWARE TECHNOLOGY, ALGORITHMS, ARCHITECTURES AND APPLICATIONS, 2004, 13 : 811 - 818
  • [38] Binding Nested OpenMP Programs on Hierarchical Memory Architectures
    Schmidl, Dirk
    Terboven, Christian
    an Mey, Dieter
    Buecker, Martin
    BEYOND LOOP LEVEL PARALLELISM IN OPENMP: ACCELERATORS, TASKING AND MORE, PROCEEDINGS, 2010, 6132 : 29 - +
  • [39] Performance Evaluation of MPI, UPC and OpenMP on Multicore Architectures
    Mallon, Damian A.
    Taboada, Guillermo L.
    Teijeiro, Carlos
    Tourino, Juan
    Fraguela, Basilio B.
    Gomez, Andres
    Doallo, Ramon
    Mourino, J. Carlos
    RECENT ADVANCES IN PARALLEL VIRTUAL MACHINE AND MESSAGE PASSING INTERFACE, PROCEEDINGS, 2009, 5759 : 174 - +
  • [40] A parallel EM algorithm for Gaussian Mixture Models implemented on a NUMA system using OpenMP
    Kwedlo, Wojciech
    2014 22ND EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP 2014), 2014, : 292 - 298