Adaptation of Algorithms for efficient execution on GPUs

Cited by: 0

Authors:

Bulavintsev, Vadim G. [1]

Zhdanov, Dmitry D. [2]

Affiliations:

[1] Delft Univ Technol, Delft, Netherlands

[2] ITMO Univ, St Petersburg, Russia

Source:

OPTICAL DESIGN AND TESTING XI | 2021 / Vol. 11895

Keywords:

GPU; SIMD; control flow graph; loop optimization; DPLL; resnet

DOI:

10.1117/12.2601619

Chinese Library Classification (CLC) number:

O43 [Optics]

Subject classification codes:

070207; 0803

Abstract:

We propose a generalized method for adapting and optimizing algorithms for efficient execution on modern graphics processing units (GPUs). The method consists of three steps. First, build a control flow graph (CFG) of the algorithm. Next, transform the CFG into a tree of loops and merge non-parallelizable loops into parallelizable ones. Finally, map the resulting loop tree onto the tree of GPU computational units, unrolling the algorithm's loops as necessary to make the two trees match. The method provides a convenient and robust mental framework and strategy for GPU code optimization. We demonstrate the method by adapting a backtracking search algorithm to the GPU platform and by building an optimized implementation of the ResNeXt-50 neural network.
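The first and last steps of the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the example CFG, the function names, and the three-level hierarchy in GPU_LEVELS are assumptions made for illustration, the CFG is assumed to be reducible, and the loop-merging and unrolling steps are omitted.

```python
from collections import defaultdict

# Hypothetical CFG of a doubly nested loop: node -> successor nodes.
CFG = {
    "entry": ["outer"],
    "outer": ["inner", "exit"],  # outer loop header
    "inner": ["body", "latch"],  # inner loop header
    "body": ["inner"],           # back edge body -> inner
    "latch": ["outer"],          # back edge latch -> outer
    "exit": [],
}

# Assumed GPU hierarchy, outermost level first (not taken from the paper).
GPU_LEVELS = ["grid", "block", "thread"]


def back_edges(cfg, entry):
    """Return edges (tail, head) whose head is still on the DFS stack."""
    found, visited, on_stack = [], set(), set()

    def dfs(node):
        visited.add(node)
        on_stack.add(node)
        for succ in cfg[node]:
            if succ in on_stack:
                found.append((node, succ))
            elif succ not in visited:
                dfs(succ)
        on_stack.discard(node)

    dfs(entry)
    return found


def natural_loop(cfg, tail, head):
    """Return head plus every node that reaches tail without passing head."""
    preds = defaultdict(list)
    for node, succs in cfg.items():
        for succ in succs:
            preds[succ].append(node)
    body, work = {head, tail}, [tail]
    while work:
        node = work.pop()
        if node == head:  # self-loop: do not walk past the header
            continue
        for pred in preds[node]:
            if pred not in body:
                body.add(pred)
                work.append(pred)
    return body


def map_loops(cfg, entry):
    """Assign each loop header a GPU level according to its nesting depth."""
    loops = defaultdict(set)
    for tail, head in back_edges(cfg, entry):
        loops[head] |= natural_loop(cfg, tail, head)
    mapping = {}
    for head, body in loops.items():
        # Depth = number of loops that strictly contain this one.
        depth = sum(1 for other in loops.values() if body < other)
        mapping[head] = GPU_LEVELS[min(depth, len(GPU_LEVELS) - 1)]
    return mapping


print(map_loops(CFG, "entry"))
```

In this sketch the outermost loop is assigned to the grid level and the loop nested inside it to the block level; loops nested deeper than the assumed hierarchy all fall to the innermost level, which is where a real mapping would need the merging or unrolling described in the abstract.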


Pages: 8
