Automating Compiler-Directed Autotuning for Phased Performance Behavior

Cited by: 1
Authors
Rusira, Tharindu [1 ]
Hall, Mary [1 ]
Basu, Protonu [2 ]
Affiliations
[1] Univ Utah, Sch Comp, Salt Lake City, UT 84112 USA
[2] Lawrence Berkeley Natl Lab, Computat Res Div, Berkeley, CA 94720 USA
Source
2017 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW) | 2017
Funding
National Science Foundation (USA);
Keywords
autotuning; compiler; geometric multigrid; stencil; high performance; code generation;
DOI
10.1109/IPDPSW.2017.152
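Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract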
We describe an integration of the CHiLL compiler with OpenTuner to reduce the programmer burden in using autotuning. We use as a case study optimizing the smooth operator and its associated stencil computations in the context of Geometric Multigrid (GMG), a hierarchical linear solver that operates in multiple grid resolutions (levels). Smooth is the most performance-critical operation that runs multiple times at each grid level and effectively performs a relaxation of the approximated solution at a given grid resolution. This computation poses a particular challenge for autotuning, as the desired optimization strategy varies at different grid resolutions within the same application execution. Even though the compiler provides a number of standard and domain-specific optimizations for stencil computations, it is challenging for a programmer to decide which optimizations to perform and implement all the steps of the autotuning search. In this paper, we make the following contributions to simplify this process and make it possible to configure the application for its different phases: (1) we provide an interface (called a superscript) to concisely describe a search space and automatically generate CHiLL transformation recipes; and, (2) we use OpenTuner tailored to CHiLL transformation recipes to employ sophisticated heuristic algorithms that manage the computational complexity of search. We demonstrate performance that far exceeds that of fixed optimization strategies, while only sampling a tiny subset of the autotuning search space.
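The idea in the abstract, that the best optimization settings differ per grid level and that only a small part of the search space needs to be sampled, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the parameter names in `SEARCH_SPACE`, the `toy_cost` function (a stand-in for compiling and timing a generated variant), and the plain random sampling, which is a much simpler technique than the heuristic search algorithms OpenTuner provides. It does not use the CHiLL or OpenTuner APIs.

```python
import itertools
import random

# Hypothetical search space. These names and values are illustrative only
# and are not CHiLL's actual transformation vocabulary.
SEARCH_SPACE = {
    "tile_size": [4, 8, 16, 32],
    "unroll": [1, 2, 4],
    "fuse": [False, True],
}


def all_configs(space):
    """Yield every point in the search space as a dict."""
    keys = sorted(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def toy_cost(config, grid_size):
    """Stand-in for a measured run time (assumption, not real data).

    A real tuner would generate a code variant, compile it, and time it.
    Here the preferred tile size and fusion choice depend on the grid size,
    so different grid levels favor different configurations.
    """
    tile_penalty = abs(config["tile_size"] - grid_size // 4)
    unroll_penalty = abs(config["unroll"] - 2)
    fuse_penalty = 0 if config["fuse"] == (grid_size >= 32) else 1
    return tile_penalty + unroll_penalty + fuse_penalty


def tune_per_level(space, grid_sizes, samples, seed=0):
    """Pick the best sampled configuration separately for each grid size."""
    rng = random.Random(seed)
    configs = list(all_configs(space))
    best = {}
    for n in grid_sizes:
        # Plain random sampling without replacement.
        candidates = rng.sample(configs, samples)
        best[n] = min(candidates, key=lambda c: toy_cost(c, n))
    return best


if __name__ == "__main__":
    # Sample 8 of the 24 points for each of three grid levels.
    for level, config in tune_per_level(SEARCH_SPACE, [16, 32, 64], 8).items():
        print(level, config)
```

Tuning each grid size separately is the point of the sketch: a single fixed configuration cannot minimize the cost at every level when the preferred tile size changes with the grid resolution.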
Pages: 1362-1371
Page count: 10
Related papers
50 in total
  • [21] Compiler-Directed Leakage Reduction in Embedded Microprocessors
    Roy, Soumyaroop
    Ranganathan, Nagarajan
    Katkoori, Srinivas
    2009 IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN, 2009, : 35 - 40
  • [22] Compiler-directed selection of dynamic memory layouts
    Kandemir, M
    Kadayif, I
    PROCEEDINGS OF THE NINTH INTERNATIONAL SYMPOSIUM ON HARDWARE/SOFTWARE CODESIGN, 2001, : 219 - 224
  • [23] Compiler-directed memory management for heterogeneous MPSoCs
    Wang, Miao
    Bodin, Francois
    JOURNAL OF SYSTEMS ARCHITECTURE, 2011, 57 (01) : 134 - 145
  • [24] Compiler-Directed Whole-System Persistence
    Zeng, Jianping
    Zhang, Tong
    Jung, Changhee
    2024 ACM/IEEE 51ST ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, ISCA 2024, 2024, : 961 - 977
  • [25] Compiler-directed instruction cache leakage optimization
    Zhang, W
    Hu, JS
    Degalahal, V
    Kandemir, M
    Vijaykrishnan, N
    Irwin, MJ
    35TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO-35), PROCEEDINGS, 2002, : 208 - 218
  • [26] Improving I/O performance of applications through compiler-directed code restructuring
    Kandemir, Mahmut
    Son, Seung Woo
    Karakoy, Mustafa
    PROCEEDINGS OF THE 6TH USENIX CONFERENCE ON FILE AND STORAGE TECHNOLOGIES (FAST '08), 2008, : 159 - +
  • [27] Compiler-Directed High-Performance Intermittent Computation with Power Failure Immunity
    Choi, Jongouk
    Kittinger, Larry
    Liu, Qingrui
    Jung, Changhee
    2022 IEEE 28TH REAL-TIME AND EMBEDDED TECHNOLOGY AND APPLICATIONS SYMPOSIUM (RTAS), 2022, : 40 - 54
  • [28] Compiler-Directed Soft Error Mitigation for Embedded Systems
    Martinez-Alvarez, Antonio
    Cuenca-Asensi, Sergio A.
    Restrepo-Calle, Felipe
    Palomo Pinto, Francisco R.
    Guzman-Miranda, Hipolito
    Aguirre, Miguel A.
    IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2012, 9 (02) : 159 - 172
  • [29] A Compiler-Directed Data Prefetching Scheme for Chip Multiprocessors
    Son, Seung Woo
    Kandemir, Mahmut
    Karakoy, Mustafa
    Chakrabarti, Dhruva
    ACM SIGPLAN NOTICES, 2009, 44 (04) : 209 - 218
  • [30] Compiler-directed early load-address generation
    Cheng, BC
    Connors, DA
    Hwu, WMW
    31ST ANNUAL ACM/IEEE INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, PROCEEDINGS, 1998, : 138 - 147