Loop tiling for optimization of locality and parallelism

Authors
Liu, Song [1]
Wu, Weiguo [1]
Zhao, Bo [1]
Jiang, Qing [1]
Affiliations
[1] School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
Keywords
Economic and social effects; Memory architecture; Optimal systems; Codes (symbols); Ion beams; Iterative methods
DOI
10.7544/issn1000-1239.2015.20131387
Abstract
Loop tiling is a widely used loop transformation for exposing and exploiting parallelism and data locality on modern computer architectures. Tiling techniques fall into two main categories: fixed and parameterized. This paper systematically summarizes both categories and analyzes their advantages and disadvantages. Because tile size significantly affects the performance of tiled code, various methods for selecting an optimal tile size are described. The paper also surveys techniques for multi-level tiling, parallelism exploration, and imperfectly nested loops. From a detailed analysis of current research on loop tiling, several conclusions are drawn: 1) How to balance the trade-off between the computational complexity and the generation efficiency of tiled code has not been completely solved, and how to use loop bounds to efficiently bound the iteration space for better data locality also needs further study. 2) Optimal tile size selection remains a difficult open question, and it would be valuable to understand how tile sizes at different levels of a hierarchical memory system influence performance. 3) From an application perspective, how to automatically generate effective tiled code for arbitrarily nested loops needs further research. In addition, how to take full advantage of shared hierarchical memory and multi-core architectures to achieve a high degree of parallelism for tiled code is another interesting direction. © 2015, Science Press. All rights reserved.
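As a rough illustration of the parameterized category described in the abstract, the sketch below tiles a matrix multiplication with a tile size supplied at run time. It is not taken from the paper: the function names `matmul_naive` and `matmul_tiled` and the parameter `tile` are hypothetical, and square integer matrices are assumed so that the tiled and untiled results can be compared exactly.

```python
def matmul_naive(a, b, n):
    """Untiled reference: plain triple loop over an n x n problem."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, n, tile):
    """Tiled version; `tile` is a run-time parameter (assumed >= 1)."""
    c = [[0] * n for _ in range(n)]
    # Outer loops step over tile origins.
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for kk in range(0, n, tile):
                # Inner loops stay inside one tile; min() clips
                # partial tiles at the edge of the iteration space.
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, n)):
                        for k in range(kk, min(kk + tile, n)):
                            c[i][j] += a[i][k] * b[k][j]
    return c
```

Replacing `tile` with a literal constant would give the fixed variant. The `min()` clipping is one simple way to handle problem sizes that are not a multiple of the tile size; production code generators typically use more elaborate bound computations.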
Pages: 1160-1176