An efficient OpenMP runtime system for hierarchical architectures

Authors
Thibault, Samuel [1]
Broquedis, Francois [1]
Goglin, Brice [1]
Namyst, Raymond [1]
Wacrenier, Pierre-Andre [1]
Affiliations
[1] LaBRI, INRIA Futurs, F-33405 Talence, France
Keywords
OpenMP; nested parallelism; hierarchical thread scheduling; bubbles; multi-core; NUMA; SMP
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Exploiting the full computational power of ever deeper hierarchical multiprocessor machines requires a very careful distribution of threads and data across the underlying non-uniform architecture. The emergence of multi-core chips and NUMA machines makes it important to minimize the number of remote memory accesses, to favor cache affinities, and to guarantee fast completion of synchronization steps. By using the BubbleSched platform as a threading backend for the GOMP OpenMP compiler, we are able to easily transpose affinities of thread teams into scheduling hints using abstractions called bubbles. We then propose a scheduling strategy suited to nested OpenMP parallelism. Preliminary performance evaluations show a significant improvement in speedup on a typical NAS OpenMP benchmark application.
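The abstract gives no code, so the following is only a minimal sketch of what nested OpenMP parallelism looks like in a program, assuming a standard OpenMP 3.0 or later compiler. The function name `nested_sum`, the row-major matrix layout, and the team sizes of 2 are hypothetical choices for illustration and are not taken from the paper. Each outer thread opens its own inner team, and such inner teams are the kind of thread groups whose affinity a hierarchical runtime could record as a scheduling hint.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical example: sum a rows x cols row-major matrix using two
 * levels of parallelism. The outer team splits the rows; each outer
 * thread then opens an inner team that splits the columns of its row. */
long nested_sum(const int *m, size_t rows, size_t cols)
{
    long total = 0;

#ifdef _OPENMP
    /* Allow two active levels so the inner region really forks
     * instead of being serialized. */
    omp_set_max_active_levels(2);
#endif

    /* Outer team (size 2 is an arbitrary choice for this sketch). */
    #pragma omp parallel for reduction(+:total) num_threads(2)
    for (long i = 0; i < (long)rows; i++) {
        long row_sum = 0;

        /* Inner team: threads that all work on the same row and
         * therefore share data. */
        #pragma omp parallel for reduction(+:row_sum) num_threads(2)
        for (long j = 0; j < (long)cols; j++)
            row_sum += m[(size_t)i * cols + (size_t)j];

        total += row_sum;
    }
    return total;
}
```

With GCC this would typically be built with `-fopenmp`; without that flag the pragmas are ignored and the function runs sequentially with the same result.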
Pages: 161-172
Page count: 12