Compiler-assisted Data Distribution for Chip Multiprocessors

Cited by: 28
Authors
Li, Yong [1]
Abousamra, Ahmed
Melhem, Rami
Jones, Alex K. [1]
Affiliations
[1] Univ Pittsburgh, Dept ECE, Pittsburgh, PA 15261 USA
Keywords
partitioning; data distribution; compiler-assisted caching; data layout
DOI
10.1145/1854273.1854335
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Data access latency, a limiting factor in the performance of chip multiprocessors, grows significantly with the number of cores in non-uniform cache architectures with distributed cache banks. To mitigate this effect, it is necessary to leverage data access locality and choose an optimal data placement. Achieving this is especially challenging when other constraints such as cache capacity, coherence messages, and runtime overhead need to be considered. This paper presents a compiler-based approach for analyzing data access behavior in multi-threaded applications. The proposed experimental compiler framework employs novel compilation techniques to discover and represent multi-threaded memory access patterns (MMAPs). At run time, symbolic MMAPs are resolved and used by a partitioning algorithm to choose a partition of allocated memory blocks among the forked threads in the analyzed application. This partition is used to enforce data ownership by associating the data with the core that executes the thread owning the data. We demonstrate how this information can be used in an experimental architecture to accelerate applications. In particular, our compiler-assisted approach shows a 20% speedup over shared caching and a 5% speedup over the closest runtime approximation, "first touch".
Pages: 501-512
Page count: 12
Related papers
50 in total
  • [21] Reducing Context Switch Overhead with Compiler-Assisted Threading
    Jaaskelainen, Pekka
    Kellomaki, Pertti
    Takala, Jarmo
    Kultala, Heikki
    Lepisto, Mikael
    EUC 2008: PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING, VOL 2, WORKSHOPS, 2008, : 461 - 466
  • [22] CARE: Compiler-Assisted Recovery from Soft Failures
    Chen, Chao
    Eisenhauer, Greg
    Pande, Santosh
    Guan, Qiang
    PROCEEDINGS OF SC19: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, 2019,
  • [23] Compiler-Assisted Loop Hardening Against Fault Attacks
    Proy, Julien
    Heydemann, Karine
    Berzati, Alexandre
    Cohen, Albert
    ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2017, 14 (04)
  • [24] Compiler-assisted power optimization for clustered VLIW architectures
    Nagpal, Rahul
    Srikant, Y. N.
    PARALLEL COMPUTING, 2011, 37 (01) : 42 - 59
  • [25] Compiler-assisted energy optimization for clustered VLIW processors
    Nagpal, Rahul
    Srikant, Y. N.
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2012, 72 (08) : 944 - 959
  • [26] Compiler-Assisted Test Acceleration on GPUs for Embedded Software
    Yaneva, Vanya
    Rajan, Ajitha
    Dubach, Christophe
    PROCEEDINGS OF THE 26TH ACM SIGSOFT INTERNATIONAL SYMPOSIUM ON SOFTWARE TESTING AND ANALYSIS (ISSTA'17), 2017, : 35 - 45
  • [27] Compiler-Assisted Scheduling for Multi-Instance GPUs
    Porter, Chris
    Chen, Chao
    Pande, Santosh
    14TH WORKSHOP ON GENERAL PURPOSE PROCESSING USING GPU (GPGPU 2022), 2022, : 19 - 24
  • [28] Compiler-Assisted Overlapping of Communication and Computation in MPI Applications
    Guo, Jichi
    Yi, Qing
    Meng, Jiayuan
    Zhang, Junchao
    Balaji, Pavan
    2016 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER), 2016, : 60 - 69
  • [29] A COMPILER-ASSISTED SCHEME FOR ADAPTIVE CACHE COHERENCE ENFORCEMENT
    NGUYEN, TN
    MOUNESTOUSSI, F
    LILJA, DJ
    LI, ZY
    PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, 1994, 50 : 69 - 78
  • [30] Compiler-Assisted Selection of a Software Transactional Memory System
    Schindewolf, Martin
    Esselson, Alexander
    Karl, Wolfgang
    ARCHITECTURE OF COMPUTING SYSTEMS - ARCS 2011, 2011, 6566 : 147 - 157