DyCache: Dynamic Multi-Grain Cache Management for Irregular Memory Accesses on GPU

Cited by: 6
Authors
Guo, Hui [1]
Huang, Libo [2]
Lu, Yashuai [4]
Ma, Sheng [2]
Wang, Zhiying [3]
Affiliations
[1] Natl Univ Def Technol, Changsha 410073, Hunan, Peoples R China
[2] Natl Univ Def Technol, Sch Comp, Changsha 410073, Hunan, Peoples R China
[3] Natl Univ Def Technol, Comp Engn, Dept Comp, Changsha 410073, Hunan, Peoples R China
[4] Space Engn Univ, Beijing 101416, Peoples R China
Source
IEEE Access | 2018 / Vol. 6
Keywords
Accelerator architectures; cache memory; fine-grain cache management; GPGPU computing; irregular memory access; memory divergence; memory management
DOI
10.1109/ACCESS.2018.2818193
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
GPUs use an on-chip cache with wide (128B) cache lines to provide high bandwidth and efficient memory access for applications with regularly organized data structures. However, emerging applications exhibit many irregular control flows and memory access patterns. Irregular memory accesses generate many fine-grain accesses to the L1 data cache. This mismatch between fine-grain data accesses and the coarse-grain cache design further constrains on-chip memory space; as a result, cache lines are replaced more frequently and the L1 data cache is used inefficiently. Fine-grain cache management has been proposed to improve the efficiency of data array utilization. Unlike existing static fine-grain schemes, we propose a dynamic multi-grain cache management scheme, called DyCache, to resolve the inefficient use of the L1 data cache. By monitoring the memory access pattern of an application, DyCache dynamically alters the cache management granularity to improve GPU performance for applications with irregular memory accesses without affecting the performance of regular applications. Our experiments demonstrate that DyCache achieves a 40% geometric mean IPC improvement over the baseline cache (128B) for applications with irregular memory accesses, while it does not degrade performance for applications with regular memory accesses.
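The mismatch the abstract describes can be made concrete with a small sketch. It is not DyCache's mechanism, which this record does not detail. It only counts how many of the bytes fetched for one 32-thread warp are actually used at different cache granularities. The 4-byte word size, the candidate granularities (128, 64, 32 bytes), the 0.5 threshold, and the `pick_granularity` policy are all assumptions made for illustration.

```python
LINE_BYTES = 128  # baseline L1 cache-line size stated in the abstract
WORD_BYTES = 4    # assumption: each thread loads one aligned 4-byte word


def blocks_touched(addresses, block_bytes):
    """Return the set of block indices (of size block_bytes) the addresses fall in."""
    return {addr // block_bytes for addr in addresses}


def utilization(addresses, block_bytes, word_bytes=WORD_BYTES):
    """Fraction of fetched bytes the warp actually uses at this granularity."""
    used = len(set(addresses)) * word_bytes
    fetched = len(blocks_touched(addresses, block_bytes)) * block_bytes
    return used / fetched


def pick_granularity(addresses, candidates=(128, 64, 32), threshold=0.5):
    """Hypothetical policy: the coarsest granularity whose utilization meets
    the threshold, falling back to the finest candidate otherwise."""
    for granularity in candidates:
        if utilization(addresses, granularity) >= threshold:
            return granularity
    return candidates[-1]


# Regular pattern: 32 threads read consecutive words (one 128B line).
coalesced = [WORD_BYTES * t for t in range(32)]
# Irregular pattern: each thread lands in a different 128B line.
strided = [512 * t for t in range(32)]

for name, addrs in (("coalesced", coalesced), ("strided", strided)):
    print(name,
          "util@128B =", utilization(addrs, LINE_BYTES),
          "util@32B =", utilization(addrs, 32),
          "picked =", pick_granularity(addrs))
```

Under these assumptions the regular pattern uses every byte of its single 128B line, while the irregular pattern uses 1/32 of what is fetched at 128B and 1/8 at 32B, which is the kind of gap a finer management granularity is meant to close.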
Pages: 38881-38891
Number of pages: 11