OpenMP to GPGPU: A Compiler Framework for Automatic Translation and Optimization

Cited by: 167

Authors:

Lee, Seyong [1]

Min, Seung-Jai [1]

Eigenmann, Rudolf [1]

Affiliations:

[1] Purdue Univ, Sch ECE, W Lafayette, IN 47907 USA

Source:

ACM SIGPLAN NOTICES | 2009 / Vol. 44 / Issue 04

Funding:

National Science Foundation (USA);

Keywords:

Algorithms; Design; Performance; OpenMP; GPU; CUDA; Automatic Translation; Compiler Optimization; PROGRAMS;

DOI:

10.1145/1594835.1504194

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

GPGPUs have recently emerged as powerful vehicles for general-purpose high-performance computing. Although a new Compute Unified Device Architecture (CUDA) programming model from NVIDIA offers improved programmability for general computing, programming GPGPUs is still complex and error-prone. This paper presents a compiler framework for automatic source-to-source translation of standard OpenMP applications into CUDA-based GPGPU applications. The goal of this translation is to further improve programmability and make existing OpenMP applications amenable to execution on GPGPUs. In this paper, we have identified several key transformation techniques, which enable efficient GPU global memory access, to achieve high performance. Experimental results from two important kernels (JACOBI and SPMUL) and two NAS OpenMP Parallel Benchmarks (EP and CG) show that the described translator and compile-time optimizations work well on both regular and irregular applications, leading to performance improvements of up to 50X over the unoptimized translation (up to 328X over serial on a CPU).


Pages: 101-110

Page count: 10
