Accelerated Auto-Tuning of GPU Kernels for Tensor Computations

Authors
Li, Chendi [1]
Xu, Yufan [1]
Saravani, Sina Mahdipour [1]
Sadayappan, P. [1]
Affiliations
[1] Univ Utah, Salt Lake City, UT 84112 USA
Funding
National Science Foundation (USA)
Keywords
Auto-tuning; Design space exploration; GPU kernel optimization; Neural networks; Performance modeling; Tile-size optimization;
DOI
10.1145/3650200.3656626
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
TVM is a state-of-the-art auto-tuning compiler for the synthesis of high-performance implementations of tensor computations. However, an extensive search in the vast design space via thousands of compile-execute trials is often needed to identify high-performance code versions, leading to high auto-tuning time. This paper develops new performance modeling and design space exploration strategies to accelerate the code optimization process within TVM. Experimental evaluation on a number of matrix-matrix multiplication and 2D convolution kernels demonstrates about an order-of-magnitude improvement in auto-tuning time to achieve the same level of code performance.
Pages: 549-561
Number of pages: 13