Multithreaded Pipeline Synthesis for Data-Parallel Kernels

Cited by: 0
|
Authors
Tan, Mingxing [1]
Liu, Bin [2]
Dai, Steve [1]
Zhang, Zhiru [1]
Affiliations
[1] Cornell Univ, Sch Elect & Comp Engn, Ithaca, NY 14850 USA
[2] Facebook Inc, Menlo Pk, CA USA
Keywords
HIGH-LEVEL SYNTHESIS;
DOI
Not available
CLC Classification Number
TP301 [Theory, Methods];
Subject Classification Code
081202 ;
Abstract
Pipelining is an important technique in high-level synthesis, which overlaps the execution of successive loop iterations or threads to achieve high throughput for loop/function kernels. Since existing pipelining techniques typically enforce in-order thread execution, a variable-latency operation in one thread would block all subsequent threads, resulting in considerable performance degradation. In this paper, we propose a multithreaded pipelining approach that enables context switching to allow out-of-order thread execution for data-parallel kernels. To ensure that the synthesized pipeline is complexity-effective, we further propose efficient scheduling algorithms for minimizing the hardware overhead associated with context management. Experimental results show that our proposed techniques can significantly improve the effective pipeline throughput over conventional approaches while conserving hardware resources.
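The throughput effect described in the abstract can be illustrated with a minimal cycle-count sketch. This is an assumed analytical model, not the paper's synthesis flow: threads issue one per cycle, each thread performs one variable-latency operation, and we compare in-order retirement (a slow operation stalls all later threads) against out-of-order completion of the kind that context switching enables.

```python
# Assumed model: thread i issues at cycle i and its variable-latency
# operation takes latencies[i] cycles. All names here are illustrative.

def in_order_cycles(latencies):
    """In-order pipeline: thread i cannot retire before thread i-1.

    A single long-latency operation therefore delays every later thread.
    """
    done = 0
    for issue, lat in enumerate(latencies):
        # Retire no earlier than own completion, and at least one
        # cycle after the previous thread retires.
        done = max(issue + lat, done + 1)
    return done

def out_of_order_cycles(latencies):
    """With context switching, each thread finishes as soon as its own
    operation completes; total time is the latest completion."""
    return max(issue + lat for issue, lat in enumerate(latencies))

if __name__ == "__main__":
    # One slow operation (e.g., a 20-cycle cache miss) among 2-cycle hits.
    lats = [2, 20, 2, 2, 2, 2]
    print("in-order:", in_order_cycles(lats))        # the miss stalls all
    print("out-of-order:", out_of_order_cycles(lats))
```

With uniform latencies the two models coincide; the gap opens only when latency varies, which matches the abstract's claim that variable-latency operations are what degrade in-order pipelines.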
Pages: 718 - 725
Page count: 8