Augmenting loop tiling with data alignment for improved cache performance

Cited by: 49
Authors
Panda, PR
Nakamura, H
Dutt, ND
Nicolau, A
Affiliations
[1] Synopsys Inc, Mt View, CA 94043 USA
[2] Univ Tokyo, Adv Sci & Technol Res Ctr, Meguro Ku, Tokyo 1538904, Japan
[3] Univ Calif Irvine, Dept Informat & Comp Sci, Irvine, CA 92697 USA
Funding
National Science Foundation (USA);
Keywords
loop tiling; data cache; data alignment; cache conflict;
DOI
10.1109/12.752655
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Loop blocking (tiling) is a well-known compiler optimization that helps improve cache performance by dividing the loop iteration space into smaller blocks (tiles); reuse of array elements within each tile is maximized by ensuring that the working set for the tile fits into the data cache. Padding is a data alignment technique that inserts dummy elements into a data structure to improve cache performance. In this work, we present DAT, a technique that augments loop tiling with data alignment, achieving improved efficiency (by ensuring that the cache is never under-utilized) as well as improved flexibility (by eliminating self-interference cache conflicts independent of the tile size). This results in more stable and better cache performance than existing approaches, in addition to maximizing cache utilization, eliminating self-interference, and minimizing cross-interference conflicts. Further, while all previous efforts are targeted at programs characterized by the reuse of a single array, we also address the issue of minimizing conflict misses when several tiled arrays are involved. To validate our technique, we ran extensive experiments using both simulations and actual measurements on SUN Sparc5 and Sparc10 workstations. The results on benchmarks exhibiting varying memory access patterns demonstrate the effectiveness of our technique through consistently high hit ratios and improved performance across varying problem sizes.
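To make the two ingredients named in the abstract concrete, the following is a minimal sketch of a tiled matrix multiply over arrays whose rows carry padding. It is not the DAT algorithm: the abstract does not say how DAT chooses tile sizes or pad amounts, so `tile` and `pad` are simply caller-supplied here. The function names (`to_padded`, `from_padded`, `matmul_tiled`) and the flat row-major layout are assumptions made for illustration only, and in Python the sketch shows the loop structure and memory layout rather than any measurable cache effect.

```python
def to_padded(rows, pad):
    """Flatten a square matrix row-major, appending `pad` dummy elements per row."""
    flat = []
    for row in rows:
        flat.extend(row)
        flat.extend([0.0] * pad)  # dummy elements: never read as data
    return flat


def from_padded(flat, n, pad):
    """Recover the n x n matrix from the padded flat layout."""
    stride = n + pad
    return [flat[i * stride:i * stride + n] for i in range(n)]


def matmul_tiled(a, b, n, tile, pad=0):
    """Compute C = A * B for n x n matrices stored with row stride n + pad.

    The j and k loops are split into blocks of `tile` iterations, so each
    (kk, jj) block of B is reused across every row i before moving on.
    """
    stride = n + pad  # padding changes the address of each row start
    c = [0.0] * (n * stride)
    for jj in range(0, n, tile):
        for kk in range(0, n, tile):
            for i in range(n):
                for k in range(kk, min(kk + tile, n)):
                    aik = a[i * stride + k]
                    for j in range(jj, min(jj + tile, n)):
                        c[i * stride + j] += aik * b[k * stride + j]
    return c
```

In a compiled language the same structure applies: the tile size bounds the working set per block, while the pad shifts row start addresses so that rows of a tile are less likely to map to the same cache lines.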
Pages: 142-149
Number of pages: 8
Related papers
50 in total
  • [21] Zero cost indexing for improved processor cache performance
    Givargis, T
    ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2006, 11 (01) : 3 - 25
  • [22] Cache Aging Reduction with Improved Performance using Dynamically Re-sizable Cache
    Mahmood, Haroon
    Poncino, Massimo
    Macii, Enrico
    2014 DESIGN, AUTOMATION AND TEST IN EUROPE CONFERENCE AND EXHIBITION (DATE), 2014,
  • [23] Improve Performance of Data Warehouse by Query Cache
    Gour, Vishal
    Sarangdevot, S. S.
    Sharma, Anand
    Choudhary, Vinod
    INTERNATIONAL CONFERENCE ON METHODS AND MODELS IN SCIENCE AND TECHNOLOGY (ICM2ST-10), 2010, 1324 : 198 - +
  • [24] On Performance of Cache Policies in Named Data Networking
    Ran, Jianhua
    Lv, Na
    Zhang, Ding
    Ma, Yuanyuan
    Xie, Zhenyong
    PROCEEDINGS OF THE 2013 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER SCIENCE AND ELECTRONICS INFORMATION (ICACSEI 2013), 2013, 41 : 668 - 671
  • [25] Improving Last Level Cache Locality by Integrating Loop and Data Transformations
    Ding, Wei
    Kandemir, Mahmut
    2012 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN (ICCAD), 2012, : 65 - 72
  • [26] REDUCTION OF CACHE COHERENCE OVERHEAD BY COMPILER DATA LAYOUT AND LOOP TRANSFORMATION
    JU, YJ
    DIETZ, H
    LECTURE NOTES IN COMPUTER SCIENCE, 1992, 589 : 344 - 358
  • [27] Loop pinpoints of Cache side channel attacks from a performance analysis
    Peng S.
    Zhao J.
    Han J.
    Qinghua Daxue Xuebao/Journal of Tsinghua University, 2020, 60 (06): : 449 - 455
  • [28] Write Mode Aware Loop Tiling for High Performance Low Power Volatile PCM
    Qiu, Keni
    Li, Qingan
    Xue, Chun Jason
    2014 51ST ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2014,
  • [29] PERFORMANCE OF HASHED CACHE DATA MIGRATION SCHEMES ON MULTICOMPUTERS
    HIRANANDANI, S
    SALTZ, J
    MEHROTRA, P
    BERRYMAN, H
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1991, 12 (04) : 415 - 422
  • [30] Design considerations of high performance data cache with prefetching
    Chi, CH
    Yuan, YL
    EURO-PAR'99: PARALLEL PROCESSING, 1999, 1685 : 1243 - 1250