Characterizing Massively Parallel Polymorphism

Cited by: 1
Authors
Zhang, Mengchi [1 ]
Alawneh, Ahmad [1 ]
Rogers, Timothy G. [1 ]
Affiliation
[1] Purdue Univ, W Lafayette, IN 47907 USA
Source
2021 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE (ISPASS 2021), 2021
Funding
U.S. National Science Foundation
Keywords
EFFICIENT; PERFORMANCE; LANGUAGE; CALLS;
DOI
10.1109/ISPASS51385.2021.00037
CLC number
TP3 [Computing technology, computer technology]
Discipline code
0812
Abstract
GPU computing has matured to include advanced C++ programming features. As a result, complex applications can potentially benefit from the continued performance improvements made to contemporary GPUs with each new generation. Tighter integration between the CPU and GPU, including a shared virtual memory space, increases the usability of productive programming paradigms traditionally reserved for CPUs, like object-oriented programming. Programmers are no longer forced to restructure both their code and data for GPU acceleration. However, the implementation and performance implications of advanced C++ on massively multithreaded accelerators have not been well studied. In this paper, we study the effects of runtime polymorphism on GPUs. We first detail the implementation of virtual function calls in contemporary GPUs using microbenchmarking. We then propose Parapoly, the first open-source polymorphic GPU benchmark suite. Using Parapoly, we further characterize the overhead caused by executing dynamic dispatch on GPUs using massively scaled CPU workloads. Our characterization demonstrates that the optimization space for runtime polymorphism on GPUs is fundamentally different than for CPUs. Where indirect branch prediction and ILP extraction strategies have dominated the work on CPU polymorphism, GPUs are fundamentally limited by excessive memory system contention caused by virtual function lookup and register spilling. Using the results of our study, we enumerate several pitfalls when writing polymorphic code for GPUs and suggest several new areas of system and architecture research that can help alleviate overhead.
Pages: 205-216
Page count: 12