Optimized unrolling of nested loops

Cited by: 23
Authors
Sarkar, V [1]
Affiliation
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
loop transformations; loop unrolling; unroll-and-jam; unroll factors
DOI
10.1023/A:1012246031671
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Loop unrolling is a well-known loop transformation that has been used in optimizing compilers for over three decades. In this paper, we address the problems of automatically selecting unroll factors for perfectly nested loops, and generating compact code for the selected unroll factors. Compared to past work, the contributions of our work include (i) a more detailed cost model that includes register locality, instruction-level parallelism and instruction-cache considerations; (ii) a new code generation algorithm that generates more compact code than the unroll-and-jam transformation; and (iii) a new algorithm for efficiently enumerating feasible unroll vectors. Our experimental results confirm the wide applicability of our approach by showing a 2.2x speedup on matrix multiply, and an average 1.08x speedup on seven of the SPEC95fp benchmarks (with a 1.2x speedup for two benchmarks). Larger performance improvements can be expected on processors that have larger numbers of registers and larger degrees of instruction-level parallelism than the processor used for our measurements (PowerPC 604).
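To make the terminology in the abstract concrete, the sketch below shows the classic unroll-and-jam transformation on matrix multiply, the baseline that the abstract says the paper improves on. It is not the paper's code generation algorithm or cost model. The unroll vector (2, 2, 1) for the (i, j, k) loops, the function names, and the choice of Python are all assumptions made for illustration; Python shows the shape of the transformed loop nest, not its performance.

```python
def matmul_baseline(a, b, n):
    """Reference triple loop: c = a * b for n x n lists of lists."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_unroll_jam(a, b, n):
    """Unroll i and j by 2 and fuse ("jam") the four copies into one k loop.

    The unroll vector (2, 2, 1) is a hypothetical choice, not a value
    taken from the paper.
    """
    c = [[0.0] * n for _ in range(n)]
    n2 = n - n % 2  # largest even bound covered by the unrolled loops

    for i in range(0, n2, 2):
        for j in range(0, n2, 2):
            # Four accumulators model values a compiler would keep in registers.
            c00 = c01 = c10 = c11 = 0.0
            for k in range(n):
                a0, a1 = a[i][k], a[i + 1][k]  # each load is reused twice
                b0, b1 = b[k][j], b[k][j + 1]
                c00 += a0 * b0
                c01 += a0 * b1
                c10 += a1 * b0
                c11 += a1 * b1
            c[i][j], c[i][j + 1] = c00, c01
            c[i + 1][j], c[i + 1][j + 1] = c10, c11

    # Remainder code for odd n: the last row, then the rest of the last column.
    if n2 < n:
        last = n - 1
        for j in range(n):
            for k in range(n):
                c[last][j] += a[last][k] * b[k][j]
        for i in range(n2):  # stops before the corner, already done above
            for k in range(n):
                c[i][last] += a[i][k] * b[k][last]
    return c
```

The jammed body performs four multiply-adds for four loads, where the baseline performs one multiply-add for two loads; this is the register-locality effect the abstract's cost model accounts for. The separate remainder loops illustrate the code growth that motivates generating more compact code.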
Pages: 545-581
Page count: 37
Related papers
50 in total
  • [41] Optimizing nested loops with iterational and instructional retiming
    Xue, C
    Shao, ZL
    Liu, ML
    Qiu, MK
    Sha, EHM
    EMBEDDED AND UBIQUITOUS COMPUTING - EUC 2005, 2005, 3824 : 164 - 173
  • [42] ANALYSIS OF PARALLELISM IN NESTED DO LOOPS.
    Foulk, Patrick W.
    Nassar, Salwa M.
    1600, (05):
  • [43] Generation of Efficient Nested Loops from Polyhedra
    Fabien Quilleré
    Sanjay Rajopadhye
    Doran Wilde
    International Journal of Parallel Programming, 2000, 28 : 469 - 498
  • [44] On loop transformations of nested loops with affine dependencies
    Popp, A
    Zimmermann, KH
    COMPUTER PHYSICS COMMUNICATIONS, 2001, 139 (01) : 90 - 103
  • [45] PARALLELIZING NESTED LOOPS ON MULTICOMPUTERS - THE GROUPING APPROACH
    KING, CT
    KAU, IR
    PROCEEDINGS : THE THIRTEENTH ANNUAL INTERNATIONAL COMPUTER SOFTWARE & APPLICATIONS CONFERENCE, 1989, : 136 - 142
  • [46] A new code transformation technique for nested loops
    Simecek, Ivan
    Tvrdik, Pavel
    COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2014, 11 (04) : 1381 - 1416
  • [47] Students’ Understanding of Loops and Nested Loops in Computer Programming: An APOS Theory Perspective
    Cetin I.
    Canadian Journal of Science, Mathematics and Technology Education, 2015, 15 (2) : 155 - 170
  • [48] LOOPS: LOcally Optimized Polygon Simplification
    Amiraghdam, Alireza
    Diehl, Alexandra
    Pajarola, Renato
    COMPUTER GRAPHICS FORUM, 2022, 41 (03) : 355 - 365
  • [49] A NEW APPROACH TO SCHEDULE OPERATIONS ACROSS NESTED-IFS AND NESTED-LOOPS
    HUANG, SH
    HWANG, CT
    HSU, YC
    OYANG, YJ
    MICROPROCESSING AND MICROPROGRAMMING, 1995, 41 (01) : 37 - 52
  • [50] A BSP approach to the scheduling of tightly-nested loops
    Calinescu, R
    11TH INTERNATIONAL PARALLEL PROCESSING SYMPOSIUM, PROCEEDINGS, 1997, : 549 - 553