Dynamic and Speculative Polyhedral Parallelization Using Compiler-Generated Skeletons

Cited by: 0
Authors
Alexandra Jimborean
Philippe Clauss
Jean-François Dollinger
Vincent Loechner
Juan Manuel Martinez Caamaño
Affiliations
[1] University of Uppsala, UPMARC
[2] University of Strasbourg, ICube, INRIA, CNRS
Keywords
Algorithmic skeletons; Polytope model; Automatic parallelization; Speculative parallelization; Dynamic parallelization; Loop nests; Compilation
DOI
Not available
Abstract
We propose a framework for speculative parallelization of scientific nested-loop kernels, built on an original way of generating and using algorithmic skeletons. The framework applies polyhedral transformations to the target code at run-time in order to expose parallelism and improve data locality. Parallel code generation is achieved at almost no cost by using binary algorithmic skeletons that are generated at compile-time. These skeletons embed the original code together with operations that instantiate a polyhedral parallelizing transformation and verify the speculations on dependences. The skeletons are patched at run-time to generate the executable code. The run-time process includes a transformation selection guided by online profiling phases on short samples, using an instrumented version of the code. During this phase, the accessed memory addresses are used to compute dependence distance vectors on the fly, and are also interpolated to build a predictor of the forthcoming accesses. The interpolating functions and distance vectors are then used for dependence analysis to select a parallelizing transformation that, if the prediction is correct, does not induce any rollback during execution. To keep the rollback time overhead low, the code is executed in successive slices of the outermost original loop of the nest. Each slice can be a parallel version that instantiates a skeleton, a sequential original version, or an instrumented version. Moreover, slicing the execution makes it possible to transform the code differently to adapt to the observed execution phases, by patching one of the pre-built skeletons differently. The framework has been implemented with extensions of the LLVM compiler and an x86-64 runtime system. Significant speed-ups are shown on a set of benchmarks that could not have been handled efficiently by a compiler.
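The profiling and verification idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a two-deep loop nest whose accesses are affine in the loop indices, and every name in it (`fit_affine`, `slice_is_valid`, `row_major`, `phase_change`) is hypothetical. It fits an interpolating function to addresses observed on a short sample, then checks, slice by slice, whether the prediction still holds; a failed check stands in for the point where a real system would roll back.

```python
from itertools import product


def fit_affine(samples):
    """Fit addr = a*i + b*j + c to a profiled sample {(i, j): addr}.

    Returns (a, b, c), or None when some sampled access is not affine,
    in which case no prediction (and no speculation) is possible.
    """
    c = samples[(0, 0)]
    a = samples[(1, 0)] - c
    b = samples[(0, 1)] - c
    if any(a * i + b * j + c != addr for (i, j), addr in samples.items()):
        return None
    return a, b, c


def slice_is_valid(coeffs, observed, i_range, j_range):
    """Verify the speculation over one slice of the outermost loop.

    Returns False as soon as an observed address differs from the
    predicted one (hypothetical stand-in for triggering a rollback).
    """
    a, b, c = coeffs
    return all(a * i + b * j + c == observed(i, j)
               for i, j in product(i_range, j_range))


def row_major(i, j):
    # Assumed access pattern: A[i][j], 100 elements per row, 8 bytes each.
    return 0x1000 + 8 * (100 * i + j)


def phase_change(i, j):
    # Assumed behaviour change: the access pattern switches at i == 10.
    return row_major(i, j) if i < 10 else row_major(j, i)


# Profile a short sample, then build the predictor.
profile = {(i, j): row_major(i, j) for i, j in product(range(4), range(4))}
coeffs = fit_affine(profile)
```

Under these assumptions, a slice covering `i` in `range(4, 8)` validates against `phase_change`, while a slice covering `range(8, 12)` does not, so only the work of that one slice would need to be redone after re-profiling.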
Pages: 529-545
Page count: 16