Piecewise Holistic Autotuning of Compiler and Runtime Parameters

Cited by: 4
Authors
Popov, Mihail [1]
Akel, Chadi [2]
Jalby, William [1]
Castro, Pablo de Oliveira [1]
Affiliations
[1] Univ Versailles St Quentin En Yvelines, Univ Paris Saclay, Versailles, France
[2] Exascale Comp Res, Versailles, France
Source
Keywords
CODE;
DOI
10.1007/978-3-319-43659-3_18
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The complexity of current architectures requires fine-tuning of compiler and runtime parameters to reach full performance potential. Autotuning substantially improves on default parameters in many scenarios, but it is a costly process requiring a long iterative evaluation. We propose an automatic piecewise autotuner based on CERE (Codelet Extractor and REplayer). CERE decomposes applications into small pieces called codelets: each codelet maps to a loop or to an OpenMP parallel region and can be replayed as a standalone program. Codelet autotuning achieves better speedups at a lower tuning cost. By grouping codelet invocations with the same performance behavior, CERE reduces the number of loops or OpenMP regions to be evaluated. Moreover, unlike whole-program tuning, CERE customizes the set of best parameters for each specific OpenMP region or loop. We demonstrate CERE tuning of compiler optimizations, number of threads, and thread affinity on a NUMA architecture. On average over the NAS 3.0 benchmarks, we achieve a speedup of 1.08x after tuning. Tuning a single codelet is 13x cheaper than whole-program evaluation and estimates the tuning impact on the original region with 94.7% accuracy. On a Reverse Time Migration (RTM) proto-application, we achieve a 1.11x speedup with a 200x cheaper exploration.
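The contrast the abstract draws between whole-program tuning and per-region tuning can be illustrated with a minimal sketch. It does not use CERE or its interface. The region names, the configurations (an optimization level paired with a thread count), and all timings are hypothetical values chosen for illustration, and the function names are assumptions of this sketch.

```python
from typing import Dict, Tuple

# A configuration is (optimization level, thread count). Hypothetical.
Config = Tuple[str, int]

# Hypothetical replay timings in seconds: region -> {configuration -> time}.
# These numbers are invented for illustration, not taken from the paper.
timings: Dict[str, Dict[Config, float]] = {
    "region_a": {("-O2", 4): 2.0, ("-O3", 4): 1.5, ("-O3", 8): 1.8},
    "region_b": {("-O2", 4): 1.0, ("-O3", 4): 1.2, ("-O3", 8): 0.8},
}


def best_whole_program(t: Dict[str, Dict[Config, float]]) -> Config:
    """Pick the single configuration that minimizes the summed time."""
    configs = list(next(iter(t.values())).keys())
    return min(configs, key=lambda c: sum(region[c] for region in t.values()))


def best_per_region(t: Dict[str, Dict[Config, float]]) -> Dict[str, Config]:
    """Pick, for each region independently, its fastest configuration."""
    return {name: min(region, key=region.get) for name, region in t.items()}


def total_time(t: Dict[str, Dict[Config, float]],
               choice: Dict[str, Config]) -> float:
    """Sum the time of every region under its chosen configuration."""
    return sum(t[name][choice[name]] for name in t)


whole = best_whole_program(timings)
piecewise = best_per_region(timings)
whole_total = total_time(timings, {name: whole for name in timings})
piecewise_total = total_time(timings, piecewise)
print(whole, round(whole_total, 2))
print(piecewise, round(piecewise_total, 2))
```

Because each region is free to pick its own configuration, the per-region total can never exceed the whole-program total over the same measurements. The sketch ignores any cost of switching configurations between regions.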
Pages: 238-250
Page count: 13
Related papers
50 in total
  • [1] Piecewise holistic autotuning of parallel programs with CERE
    Popov, Mihail
    Akel, Chadi
    Chatelain, Yohan
    Jalby, William
    Castro, Pablo de Oliveira
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2017, 29 (15)
  • [2] SOCRATES - A Seamless Online Compiler and System Runtime AutoTuning Framework for Energy-Aware Applications
    Gadioli, Davide
    Nobre, Ricardo
    Pinto, Pedro
    Vitali, Emanuele
    Ashouri, Amir H.
    Palermo, Gianluca
    Cardoso, Joao
    Silvano, Cristina
    PROCEEDINGS OF THE 2018 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE), 2018: 1143-1146
  • [3] Autotuning CUDA compiler parameters for heterogeneous applications using the OpenTuner framework
    Bruel, Pedro
    Amaris, Marcos
    Goldman, Alfredo
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2017, 29 (22)
  • [4] A Survey on Compiler Autotuning using Machine Learning
    Ashouri, Amir H.
    Killian, William
    Cavazos, John
    Palermo, Gianluca
    Silvano, Cristina
    ACM COMPUTING SURVEYS, 2019, 51 (05)
  • [5] A Compiler and Runtime for Heterogeneous Computing
    Auerbach, Joshua
    Bacon, David F.
    Burcea, Ioana
    Cheng, Perry
    Fink, Stephen J.
    Rabbah, Rodric
    Shukla, Sunil
    2012 49TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2012: 271-276
  • [6] A SYCL Compiler and Runtime Architecture
    Bader, Alexey
    Brodman, James
    Kinsner, Michael
    PROCEEDINGS OF THE INTERNATIONAL WORKSHOP ON OPENCL (IWOCL'19), 2019
  • [7] Efficient Compiler Autotuning via Bayesian Optimization
    Chen, Junjie
    Xu, Ningxin
    Chen, Peiqi
    Zhang, Hongyu
    2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2021), 2021: 1198-1209
  • [8] Application Autotuning to Support Runtime Adaptivity in Multicore Architectures
    Gadioli, Davide
    Palermo, Gianluca
    Silvano, Cristina
    PROCEEDINGS INTERNATIONAL CONFERENCE ON EMBEDDED COMPUTER SYSTEMS - ARCHITECTURES, MODELING AND SIMULATION (SAMOS XV), 2015: 173-180
  • [9] Compiler Autotuning through Multiple-phase Learning
    Zhu, Mingxuan
    Hao, Dan
    Chen, Junjie
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2024, 33 (04)
  • [10] COBAYN: Compiler Autotuning Framework Using Bayesian Networks
    Ashouri, Amir Hossein
    Mariani, Giovanni
    Palermo, Gianluca
    Park, Eunjung
    Cavazos, John
    Silvano, Cristina
    ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2016, 13 (02)