Automating Compiler-Directed Autotuning for Phased Performance Behavior

Cited by: 1
Authors
Rusira, Tharindu [1]
Hall, Mary [1]
Basu, Protonu [2]
Affiliations
[1] Univ Utah, Sch Comp, Salt Lake City, UT 84112 USA
[2] Lawrence Berkeley Natl Lab, Computat Res Div, Berkeley, CA 94720 USA
Source
2017 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW) | 2017
Funding
National Science Foundation (USA);
Keywords
autotuning; compiler; geometric multigrid; stencil; high performance; code generation;
DOI
10.1109/IPDPSW.2017.152
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
We describe an integration of the CHiLL compiler with OpenTuner to reduce the programmer burden in using autotuning. We use as a case study optimizing the smooth operator and its associated stencil computations in the context of Geometric Multigrid (GMG), a hierarchical linear solver that operates in multiple grid resolutions (levels). Smooth is the most performance-critical operation that runs multiple times at each grid level and effectively performs a relaxation of the approximated solution at a given grid resolution. This computation poses a particular challenge for autotuning, as the desired optimization strategy varies at different grid resolutions within the same application execution. Even though the compiler provides a number of standard and domain-specific optimizations for stencil computations, it is challenging for a programmer to decide which optimizations to perform and implement all the steps of the autotuning search. In this paper, we make the following contributions to simplify this process and make it possible to configure the application for its different phases: (1) we provide an interface (called a superscript) to concisely describe a search space and automatically generate CHiLL transformation recipes; and, (2) we use OpenTuner tailored to CHiLL transformation recipes to employ sophisticated heuristic algorithms that manage the computational complexity of search. We demonstrate performance that far exceeds that of fixed optimization strategies, while only sampling a tiny subset of the autotuning search space.
Pages: 1362-1371
Page count: 10