The TRegion Interface and Compiler Optimizations for OPENMP Target Regions

Cited by: 8
Authors
Doerfert, Johannes [1]
Diaz, Jose Manuel Monsalve [1]
Finkel, Hal [1]
Affiliations
[1] Argonne Natl Lab, Argonne Leadership Comp Facil, Argonne, IL 60439 USA
Keywords
Compiler optimizations; GPU; Accelerator offloading
DOI
10.1007/978-3-030-28596-8_11
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
OPENMP is a well-established, single-source programming language extension that introduces parallelism into (historically) sequential base languages, namely C/C++ and Fortran. To program not only multi-core CPUs but also many-core processors and heavily parallel accelerators, OPENMP 4.0 adopted a flexible offloading scheme inspired by the hierarchy found in many GPU designs. This flexibility allows the offloading scheme to be used in a variety of application scenarios. However, it may also result in a significant performance loss, especially because OPENMP semantics are traditionally interpreted solely in the language front-end as a way to avoid problems with the "sequential-execution-minded" optimization pipeline. Given the limited analysis and transformation capabilities of a modern compiler front-end, the actual syntax used for OPENMP offloading can substantially impact the observed performance. The compiler front-end will always have to favor correct but overly conservative code if certain facts are not syntactically obvious. In this work, we investigate how we can delay (target-specific) implementation decisions that are currently taken early during the compilation of OPENMP offloading code. We prototyped our solution in LLVM/Clang, an industrial-strength OPENMP compiler, to show that we can use semantic source code analyses as a rationale instead of relying on the user-provided syntax. Our preliminary results on the rather simple Rodinia benchmarks already show speedups of up to 1.55x.
Pages: 153-167
Page count: 15