Augmenting loop tiling with data alignment for improved cache performance

Cited by: 49
Authors
Panda, PR
Nakamura, H
Dutt, ND
Nicolau, A
Affiliations
[1] Synopsys Inc, Mt View, CA 94043 USA
[2] Univ Tokyo, Adv Sci & Technol Res Ctr, Meguro Ku, Tokyo 1538904, Japan
[3] Univ Calif Irvine, Dept Informat & Comp Sci, Irvine, CA 92697 USA
Funding
National Science Foundation (USA)
Keywords
loop tiling; data cache; data alignment; cache conflict
DOI
10.1109/12.752655
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Loop blocking (tiling) is a well-known compiler optimization that improves cache performance by dividing the loop iteration space into smaller blocks (tiles); reuse of array elements within each tile is maximized by ensuring that the working set for the tile fits into the data cache. Padding is a data alignment technique that inserts dummy elements into a data structure to improve cache performance. In this work, we present DAT, a technique that augments loop tiling with data alignment, achieving improved efficiency (by ensuring that the cache is never under-utilized) as well as improved flexibility (by eliminating self-interference cache conflicts independent of the tile size). This results in more stable and better cache performance than existing approaches, in addition to maximizing cache utilization, eliminating self-interference, and minimizing cross-interference conflicts. Further, while all previous efforts are targeted at programs characterized by the reuse of a single array, we also address the issue of minimizing conflict misses when several tiled arrays are involved. To validate our technique, we ran extensive experiments using both simulations and actual measurements on SUN Sparc5 and Sparc10 workstations. The results on benchmarks exhibiting varying memory access patterns demonstrate the effectiveness of our technique through consistently high hit ratios and improved performance across varying problem sizes.
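The interaction between tile shape, row length, and padding described in the abstract can be illustrated with a small model. The sketch below is not the DAT algorithm, whose details the abstract does not give. It assumes a direct-mapped cache with one array element per cache line and a row-major array, and the function names `self_conflicts` and `smallest_conflict_free_pad` are hypothetical. It counts how many elements of one tile collide with other elements of the same tile (self-interference), then searches for the smallest row padding that removes those collisions.

```python
def self_conflicts(leading_dim, tile_rows, tile_cols, cache_elems):
    """Count tile elements that map to a cache slot the tile already uses.

    Assumed model (not from the paper): direct-mapped cache holding
    `cache_elems` elements, one element per line, row-major array whose
    rows are `leading_dim` elements apart, tile anchored at element (0, 0).
    """
    seen = set()
    conflicts = 0
    for i in range(tile_rows):
        for j in range(tile_cols):
            slot = (i * leading_dim + j) % cache_elems
            if slot in seen:
                conflicts += 1  # self-interference within the tile
            else:
                seen.add(slot)
    return conflicts


def smallest_conflict_free_pad(n_cols, tile_rows, tile_cols, cache_elems):
    """Return the smallest number of dummy elements per row that removes
    all self-interference for the tile, or None if no padding works
    (for example, when the tile is larger than the cache)."""
    for pad in range(cache_elems):
        if self_conflicts(n_cols + pad, tile_rows, tile_cols, cache_elems) == 0:
            return pad
    return None
```

Under these assumptions, a 4 x 4 tile of an array with 1024-element rows in a 1024-element cache has 12 self-conflicting elements, because every row of the tile maps to the same four slots. Padding each row by 4 elements shifts successive rows to distinct slots and removes the conflicts without changing the tile size.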
Pages: 142-149
Page count: 8
Related papers
50 in total
  • [1] Augmenting loop tiling with data alignment for improved cache performance
    Synopsys, Inc, Mountain View, United States
    IEEE Trans Comput, 2 (142-148):
  • [2] Improving cache performance through tiling and data alignment
    Panda, PR
    Nakamura, H
    Dutt, ND
    Nicolau, A
    SOLVING IRREGULARLY STRUCTURED PROBLEMS IN PARALLEL, 1997, 1253 : 167 - 185
  • [3] Enabling loop fusion and tiling for cache performance by fixing fusion-preventing data dependences
    Xue, JL
    Huang, QG
    2005 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, PROCEEDINGS, 2005, : 107 - 115
  • [4] Defensive Loop Tiling for Shared Cache
    Bao, Bin
    Ding, Chen
    PROCEEDINGS OF THE 2013 IEEE/ACM INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION (CGO), 2013, : 324 - 334
  • [5] A data alignment technique for improving cache performance
    Panda, PR
    Nakamura, H
    Dutt, ND
    Nicolau, A
    INTERNATIONAL CONFERENCE ON COMPUTER DESIGN - VLSI IN COMPUTERS AND PROCESSORS, PROCEEDINGS, 1997, : 587 - 592
  • [6] Combining Software Cache Partitioning and Loop Tiling for Effective Shared Cache Management
    Vasilios, Kelefouras
    Georgios, Keramidas
    Nikolaos, Voros
    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2018, 17 (03)
  • [7] Code tiling for improving the cache performance of PDE solvers
    Huang, QG
    Xue, JL
    Vera, X
    2003 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, PROCEEDINGS, 2003, : 615 - 624
  • [8] CNN Accelerator Performance Dependence on Loop Tiling and the Optimum Resource-Constrained Loop Tiling
    Park, Chester Sungchung
    Park, Sungkyung
    IEEE ACCESS, 2025, 13 : 16800 - 16810
  • [9] Memory organization for improved data cache performance in embedded processors
    Panda, PR
    Dutt, ND
    Nicolau, A
    9TH INTERNATIONAL SYMPOSIUM ON SYSTEMS SYNTHESIS, PROCEEDINGS, 1996, : 90 - 95
  • [10] Near-optimal loop tiling by means of Cache Miss Equations and genetic algorithms
    Abella, J
    González, A
    Llosa, J
    Vera, X
    2002 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, PROCEEDINGS OF THE WORKSHOPS, 2002, : 568 - 577