CREPE: Concurrent Reverse-Modulo-Scheduling and Placement for CGRAs

Cited by: 0
Authors
Sunny, Chilankamol [1]
Das, Satyajit [1]
Martin, Kevin J. M. [2]
Coussy, Philippe [2]
Affiliations
[1] IIT Palakkad, Palakkad 678623, Kerala, India
[2] Univ Bretagne Sud, UMR 6285, Lab STICC, F-56100 Lorient, France
Keywords
Schedules; Routing; Kernel; Space exploration; Registers; Heuristic algorithms; Dynamic scheduling; Coarse-grained reconfigurable array (CGRA); modulo-scheduling; loop optimization; LOOPS;
DOI
10.1109/TPDS.2024.3402098
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Coarse-Grained Reconfigurable Array (CGRA) architectures are popular as high-performance and energy-efficient computing devices. Compute-intensive loop constructs of complex applications are mapped onto CGRAs by modulo-scheduling the innermost loop dataflow graph (DFG). In the state-of-the-art approaches, mapping quality is typically determined by initiation interval (II), while schedule length for one iteration is neglected. However, for nested loops, schedule length becomes important. In this article, we propose CREPE, a Concurrent Reverse-modulo-scheduling and Placement technique for CGRAs that minimizes both II and schedule length. CREPE performs simultaneous modulo-scheduling and placement coupled with dynamic graph transformations, generating good-quality mappings with high success rates. Furthermore, we introduce a compilation flow that maps nested loops onto the CGRA and modulo-schedules the innermost loop using CREPE. Experiments show that the proposed solution outperforms the conventional approaches in mapping success rate and total execution time with no impact on the compilation time. CREPE maps all kernels considered, while the state-of-the-art techniques Crimson and Epimap either fail to find a mapping or map at very high IIs. On a 2x4 CGRA, CREPE reports a 100% success rate and a speed-up of up to 5.9x over Crimson (78.5% success rate) and up to 1.4x over Epimap (46.4% success rate).
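The claim that schedule length matters for nested loops can be illustrated with the textbook cycle estimate for a modulo-scheduled loop. The sketch below is not taken from the article. It assumes that the inner-loop pipeline is filled and drained once per outer iteration with no overlap between outer iterations, and it ignores configuration and memory overheads. The function names and all numbers are hypothetical.

```python
def inner_loop_cycles(ii, schedule_length, trip_count):
    """Estimated cycles for one run of a modulo-scheduled inner loop.

    The first iteration takes schedule_length cycles; each later
    iteration starts ii cycles after the previous one.
    """
    if trip_count <= 0:
        return 0
    return schedule_length + (trip_count - 1) * ii


def nested_loop_cycles(ii, schedule_length, inner_trips, outer_trips):
    """Assumption: the inner loop is restarted from scratch on every
    outer iteration, so schedule_length is paid outer_trips times."""
    return outer_trips * inner_loop_cycles(ii, schedule_length, inner_trips)


# Two hypothetical mappings with the same II but different schedule lengths.
long_schedule = nested_loop_cycles(ii=2, schedule_length=12, inner_trips=8, outer_trips=100)
short_schedule = nested_loop_cycles(ii=2, schedule_length=6, inner_trips=8, outer_trips=100)
print(long_schedule, short_schedule)
```

Under these assumptions, two mappings with identical II differ in total cycles whenever the inner trip count is small relative to the schedule length, which is why a mapper that optimizes II alone can leave performance on the table for nested loops.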
Pages: 1293-1306
Page count: 14
Related papers
3 items in total
  • [1] Joint Modulo Scheduling and Vdd Assignment for Loop Mapping on Dual-Vdd CGRAs
    Yin, Shouyi
    Gu, Jiangyuan
    Liu, Dajiang
    Liu, Leibo
    Wei, Shaojun
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2016, 35 (09) : 1475 - 1488
  • [2] SAT-based Exact Modulo Scheduling Mapping for Resource-Constrained CGRAs
    Tirelli, Cristian
    Sapriza, Juan
    Alvarez, Ruben Rodriguez
    Ferretti, Lorenzo
    Denkinger, Benoit
    Ansaloni, Giovanni
    Calero, Jose Miranda
    Atienza, David
    Pozzi, Laura
    ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2024, 20 (03)
  • [3] CRIMSON: Compute-Intensive Loop Acceleration by Randomized Iterative Modulo Scheduling and Optimized Mapping on CGRAs
    Balasubramanian, Mahesh
    Shrivastava, Aviral
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, 39 (11) : 3300 - 3310