The TRegion Interface and Compiler Optimizations for OPENMP Target Regions

Cited by: 8
Authors
Doerfert, Johannes [1]
Diaz, Jose Manuel Monsalve [1]
Finkel, Hal [1]
Affiliation
[1] Argonne Natl Lab, Argonne Leadership Comp Facil, Argonne, IL 60439 USA
Keywords
Compiler optimizations; GPU; Accelerator offloading
DOI
10.1007/978-3-030-28596-8_11
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
OPENMP is a well-established, single-source programming language extension to introduce parallelism into (historically) sequential base languages, namely C/C++ and Fortran. To program not only multi-core CPUs but also many-cores and heavily parallel accelerators, OPENMP 4.0 adopted a flexible offloading scheme inspired by the hierarchy in many GPU designs. The flexible design of the offloading scheme allows it to be used in various application scenarios. However, it may also result in a significant performance loss, especially because OPENMP semantics is traditionally interpreted solely in the language front-end as a way to avoid problems with the "sequential-execution-minded" optimization pipeline. Given the limited analysis and transformation capabilities in a modern compiler front-end, the actual syntax used for OPENMP offloading can substantially impact the observed performance. The compiler front-end will always have to favor correct but overly conservative code if certain facts are not syntactically obvious.

In this work, we investigate how we can delay (target-specific) implementation decisions currently taken early during the compilation of OPENMP offloading code. We prototyped our solution in LLVM/Clang, an industrial-strength OPENMP compiler, to show that we can use semantic source code analyses as a rationale instead of relying on the user-provided syntax. Our preliminary results on the rather simple Rodinia benchmarks already show speedups of up to 1.55x.
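The abstract itself contains no code. As an illustration only, the following C sketch shows the kind of syntactic difference it alludes to: two target regions that perform the same computation, one written with a combined construct and one with the parallel loop nested inside a plain `target` region. The function names (`scale_combined`, `scale_split`, `variants_agree`) and the array size `N` are hypothetical and not taken from the paper; whether a given compiler generates different code for the two forms is an assumption that depends on the compiler and version.

```c
#include <stddef.h>

#define N 1024 /* hypothetical problem size, chosen for illustration */

/* Variant A: combined construct. The parallel structure of the region is
 * obvious from the syntax alone. */
static void scale_combined(float *a, float s) {
  #pragma omp target teams distribute parallel for map(tofrom: a[0:N])
  for (int i = 0; i < N; ++i)
    a[i] *= s;
}

/* Variant B: the same computation, but the parallel loop is nested inside a
 * generic target region. A front-end that decides based on syntax alone may
 * have to assume sequential code could surround the parallel loop and emit
 * more conservative device code. */
static void scale_split(float *a, float s) {
  #pragma omp target map(tofrom: a[0:N])
  {
    #pragma omp parallel for
    for (int i = 0; i < N; ++i)
      a[i] *= s;
  }
}

/* Returns 1 if both variants produce identical results, 0 otherwise. */
static int variants_agree(void) {
  static float x[N], y[N];
  for (size_t i = 0; i < N; ++i) {
    x[i] = (float)i;
    y[i] = (float)i;
  }
  scale_combined(x, 2.0f);
  scale_split(y, 2.0f);
  for (size_t i = 0; i < N; ++i)
    if (x[i] != y[i])
      return 0;
  return 1;
}
```

Calling `variants_agree()` from a `main` function checks that the two forms are semantically equivalent. Built without OpenMP support, the pragmas are ignored and both loops run sequentially on the host; with offloading enabled, any performance difference between the two variants would stem from how the compiler lowers each form, which is the gap the abstract says semantic analysis could close.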
Pages: 153-167
Page count: 15