Selective Caching: Avoiding Performance Valleys in Massively Parallel Architectures

Cited by: 0
Authors
Jadidi, Amin [1]
Kandemir, Mahmut T. [2]
Das, Chita R. [2]
Affiliations
[1] Cadence Design Syst, San Jose, CA 95134 USA
[2] Penn State Univ, Dept Comp Sci & Engn, University Pk, PA 16802 USA
Source
2020 28TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING (PDP 2020) | 2020
DOI
10.1109/PDP50117.2020.00051
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Emerging general-purpose graphics processing units (GPGPUs) use a memory hierarchy very similar to that of modern multi-core processors: they typically have multiple levels of on-chip caches and a DDR-like off-chip main memory. In such massively parallel architectures, caches are expected to reduce the average data access latency by reducing the number of off-chip memory accesses; however, our extensive experimental studies confirm that not all applications use the on-chip caches efficiently. Even though GPGPUs are adopted to run a wide range of general-purpose applications, conventional cache management policies cannot achieve optimal performance across the different memory characteristics of those applications. This paper first investigates the underlying reasons for the inefficiency of common cache management policies in GPGPUs. To address those issues, we then propose (i) a characterization mechanism to analyze each kernel at runtime, and (ii) a selective caching policy to manage the flow of cache accesses. Evaluation results on the studied platform show that our proposed dynamically reconfigurable cache hierarchy improves system performance by up to 105% (27% on average) over a wide range of modern GPGPU applications, which is within 10% of the optimal improvement.
Pages: 290-298
Page count: 9