Characterizing Massively Parallel Polymorphism

Cited by: 1
Authors
Zhang, Mengchi [1 ]
Alawneh, Ahmad [1 ]
Rogers, Timothy G. [1 ]
Affiliation
[1] Purdue Univ, W Lafayette, IN 47907 USA
Source
2021 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE (ISPASS 2021), 2021
Funding
U.S. National Science Foundation
Keywords
EFFICIENT; PERFORMANCE; LANGUAGE; CALLS;
DOI
10.1109/ISPASS51385.2021.00037
CLC number
TP3 [Computing technology, computer technology]
Discipline code
0812
Abstract
GPU computing has matured to include advanced C++ programming features. As a result, complex applications can potentially benefit from the continued performance improvements made to contemporary GPUs with each new generation. Tighter integration between the CPU and GPU, including a shared virtual memory space, increases the usability of productive programming paradigms traditionally reserved for CPUs, like object-oriented programming. Programmers are no longer forced to restructure both their code and data for GPU acceleration. However, the implementation and performance implications of advanced C++ on massively multithreaded accelerators have not been well studied. In this paper, we study the effects of runtime polymorphism on GPUs. We first detail the implementation of virtual function calls in contemporary GPUs using microbenchmarking. We then propose Parapoly, the first open-source polymorphic GPU benchmark suite. Using Parapoly, we further characterize the overhead caused by executing dynamic dispatch on GPUs using massively scaled CPU workloads. Our characterization demonstrates that the optimization space for runtime polymorphism on GPUs is fundamentally different than for CPUs. Where indirect branch prediction and ILP extraction strategies have dominated the work on CPU polymorphism, GPUs are fundamentally limited by excessive memory system contention caused by virtual function lookup and register spilling. Using the results of our study, we enumerate several pitfalls when writing polymorphic code for GPUs and suggest several new areas of system and architecture research that can help alleviate overhead.
Pages: 205-216
Page count: 12