Loop tiling for optimization of locality and parallelism

Authors
Liu, Song [1]
Wu, Weiguo [1]
Zhao, Bo [1]
Jiang, Qing [1]
Affiliation
[1] School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
Keywords
Economic and social effects - Memory architecture - Optimal systems - Codes (symbols) - Ion beams - Iterative methods
DOI
10.7544/issn1000-1239.2015.20131387
Abstract
Loop tiling is a widely used loop transformation for exposing and exploiting parallelism and data locality on modern computer architectures. Tiling techniques fall into two main categories: fixed and parameterized. This paper systematically summarizes both categories and analyzes their advantages and disadvantages. Because tile size significantly affects the performance of tiled code, various methods for selecting an optimal tile size are described. The paper also surveys techniques for multi-level tiling, parallelism exploration, and imperfectly nested loops. Based on a detailed analysis of current research on loop tiling, several conclusions are drawn: 1) How to balance the trade-off between computational complexity and the generation efficiency of tiled code has not been completely solved, and how to use loop bounds to efficiently bound the iteration space for data locality enhancement also needs further study. 2) Optimal tile size selection remains a difficult and open question, and it would be valuable to understand how tile sizes at different levels of a hierarchical memory system influence performance. 3) From an application perspective, how to automatically generate effective tiled code for arbitrarily nested loops needs further research. In addition, how to take full advantage of shared hierarchical memory and multi-core architectures to achieve a high degree of parallelism in tiled code is another interesting direction. © 2015, Science Press. All rights reserved.
Pages: 1160-1176
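
Illustrative sketch (not taken from the paper): the abstract describes loop tiling and notes that tile size is a tunable quantity. The Python below shows one common form of the transformation on a square matrix multiplication, with the tile size passed as a runtime argument, which is closest in spirit to the parameterized category. The function names `matmul_naive` and `matmul_tiled` and the parameter `tile` are hypothetical, chosen for this example only. Python will not show the cache benefit that tiling targets in compiled code; the sketch only demonstrates how the iteration space is restructured while the result stays the same.

```python
def matmul_naive(a, b, n):
    """Reference triple loop: C = A * B for n x n matrices (lists of lists)."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, n, tile):
    """Same computation with all three loops tiled by `tile` (assumed >= 1).

    The outer loops (ii, jj, kk) step over tile origins; the inner loops
    (i, j, k) sweep one tile. min(...) clamps the last, possibly partial,
    tile so that n does not have to be a multiple of `tile`.
    """
    c = [[0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for kk in range(0, n, tile):
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, n)):
                        acc = c[i][j]  # partial sum from earlier kk tiles
                        for k in range(kk, min(kk + tile, n)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
    return c
```

For each output element the products are still accumulated in ascending `k` order, so the tiled version returns the same values as the naive one for any `tile >= 1`. Which tile size performs best depends on the memory hierarchy of the target machine, which is the selection problem the abstract calls open.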