Autotuning CUDA compiler parameters for heterogeneous applications using the OpenTuner framework

Cited by: 8

Authors:

Bruel, Pedro [1]

Amaris, Marcos [1]

Goldman, Alfredo [1]

Affiliations:

[1] Univ Sao Paulo, IME, R Matao 1010, Cidade Univ, Sao Paulo, SP, Brazil

Source:

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE | 2017, Vol. 29, Issue 22

Funding:

São Paulo Research Foundation, Brazil;

Keywords:

autotuning; GPUs; compilers; CUDA; OpenTuner; PERFORMANCE ANALYSIS; MODEL;

DOI:

10.1002/cpe.3973

Chinese Library Classification:

TP31 [Computer Software];

Discipline classification codes:

081202; 0835;

Abstract:

A Graphics Processing Unit (GPU) is a parallel computing coprocessor specialized in accelerating vector operations. The enormous heterogeneity of parallel computing platforms justifies and motivates the development of automated optimization tools and techniques. The Algorithm Selection Problem consists in finding a combination of algorithms, or a configuration of an algorithm, that optimizes the solution of a set of problem instances. An autotuner solves the Algorithm Selection Problem using search and optimization techniques. In this paper, we implement an autotuner for the Compute Unified Device Architecture compiler's parameters using the OpenTuner framework. The autotuner searches for a set of compilation parameters that optimizes the time to solve a problem. We analyze the performance speedups, in comparison with high-level compiler optimizations, achieved in three different GPU devices, for 17 heterogeneous GPU applications, 12 of which are from the Rodinia Benchmark Suite. The autotuner often beats the compiler's high-level optimizations, but underperformed for some problems. We achieved over 2x speedup for Gaussian Elimination and almost 2x speedup for Heart Wall, both problems from the Rodinia Benchmark, and over 4x speedup for a matrix multiplication algorithm. Copyright (c) 2017 John Wiley & Sons, Ltd.
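The search described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation and does not use the OpenTuner API: it swaps in plain random search, and the search space (a small subset of nvcc options), the helper names, and the `toy_measure` cost model are all assumptions made for illustration. In a real setting, `measure` would compile the application with the given flags and time its execution; some of these flags also interact (for example, `--use_fast_math` already implies some of the others), which the sketch ignores.

```python
import random

# Hypothetical search space: a small subset of nvcc options (assumption,
# not the parameter set used in the paper).
SEARCH_SPACE = {
    "-O": [0, 1, 2, 3],
    "--use_fast_math": [False, True],
    "--ftz": ["true", "false"],
    "--prec-div": ["true", "false"],
    "--maxrregcount": [16, 32, 64, 128],
}


def sample_config(rng):
    """Draw one value per parameter, uniformly at random."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def to_flags(config):
    """Turn a configuration dict into a list of command-line flags."""
    flags = []
    for name, value in config.items():
        if name == "-O":
            flags.append(f"-O{value}")
        elif isinstance(value, bool):
            if value:  # boolean switches are emitted only when enabled
                flags.append(name)
        else:
            flags.append(f"{name}={value}")
    return flags


def random_search(measure, trials=50, seed=0):
    """Return the configuration with the lowest measured time.

    `measure` maps a list of flags to a time in seconds (lower is better).
    """
    rng = random.Random(seed)
    best_config, best_time = None, float("inf")
    for _ in range(trials):
        config = sample_config(rng)
        elapsed = measure(to_flags(config))
        if elapsed < best_time:
            best_config, best_time = config, elapsed
    return best_config, best_time


def toy_measure(flags):
    """Stand-in cost model (assumption); replace with compile-and-time."""
    cost = 10.0
    if "-O3" in flags:
        cost -= 3.0
    if "--use_fast_math" in flags:
        cost -= 2.0
    if "--maxrregcount=64" in flags:
        cost -= 1.0
    return cost


if __name__ == "__main__":
    config, elapsed = random_search(toy_measure, trials=200)
    print(to_flags(config), elapsed)
```

Passing the measurement function as an argument keeps the search loop independent of how a configuration is evaluated, so the same loop works with the toy model here or with a function that actually invokes the compiler.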


Pages: 15

