Piecewise Holistic Autotuning of Compiler and Runtime Parameters

Cited by: 4
Authors
Popov, Mihail [1]
Akel, Chadi [2]
Jalby, William [1]
Castro, Pablo de Oliveira [1]
Affiliations
[1] Univ Versailles St Quentin En Yvelines, Univ Paris Saclay, Versailles, France
[2] Exascale Comp Res, Versailles, France
Source
Keywords
CODE
DOI
10.1007/978-3-319-43659-3_18
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The complexity of current architectures requires fine-tuning of compiler and runtime parameters to reach full performance potential. Autotuning substantially improves on default parameters in many scenarios, but it is a costly process that requires long iterative evaluation. We propose an automatic piecewise autotuner based on CERE (Codelet Extractor and REplayer). CERE decomposes applications into small pieces called codelets: each codelet maps to a loop or to an OpenMP parallel region and can be replayed as a standalone program. Codelet autotuning achieves better speedups at a lower tuning cost. By grouping codelet invocations with the same performance behavior, CERE reduces the number of loops or OpenMP regions to be evaluated. Moreover, unlike whole-program tuning, CERE customizes the set of best parameters for each specific OpenMP region or loop. We demonstrate CERE tuning of compiler optimizations, number of threads, and thread affinity on a NUMA architecture. On average over the NAS 3.0 benchmarks, we achieve a speedup of 1.08x after tuning. Tuning a single codelet is 13x cheaper than whole-program evaluation and estimates the tuning impact on the original region with 94.7% accuracy. On a Reverse Time Migration (RTM) proto-application, we achieve a 1.11x speedup with a 200x cheaper exploration.
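The difference between whole-program and piecewise tuning described in the abstract can be illustrated with a minimal Python sketch. This is not CERE's interface: the region names, the (compiler flag, thread count) configurations, and the timings below are all hypothetical values made up for illustration, and the model assumes region times simply add up with no cost for switching parameters between regions.

```python
from typing import Dict, Tuple

# Hypothetical configuration: (compiler flag, thread count).
Config = Tuple[str, int]
# Hypothetical replay timings in seconds: region name -> {config: time}.
Timings = Dict[str, Dict[Config, float]]

# Made-up illustrative numbers, not measurements from the paper.
timings: Timings = {
    "loop_a": {("-O2", 4): 2.0, ("-O3", 4): 1.5, ("-O3", 8): 1.8},
    "parallel_b": {("-O2", 4): 3.0, ("-O3", 4): 3.4, ("-O3", 8): 2.5},
}


def best_per_region(t: Timings) -> Dict[str, Config]:
    """Piecewise tuning: pick the fastest configuration for each region."""
    return {region: min(cfgs, key=cfgs.get) for region, cfgs in t.items()}


def best_whole_program(t: Timings) -> Config:
    """Whole-program tuning: pick one configuration shared by all regions."""
    # Only configurations measured for every region can be compared.
    shared = set.intersection(*(set(cfgs) for cfgs in t.values()))
    return min(shared, key=lambda cfg: sum(cfgs[cfg] for cfgs in t.values()))


def total_time(t: Timings, choice: Dict[str, Config]) -> float:
    """Sum region times under a per-region choice of configuration."""
    return sum(t[region][cfg] for region, cfg in choice.items())


piecewise = best_per_region(timings)
whole = best_whole_program(timings)
piecewise_time = total_time(timings, piecewise)
whole_time = total_time(timings, {region: whole for region in timings})
```

Under this additive model, the piecewise choice can never be slower than the best single configuration, since each region independently takes its own minimum.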
Pages: 238-250
Page count: 13