Adaptation of Algorithms for efficient execution on GPUs

Authors
Bulavintsev, Vadim G. [1]
Zhdanov, Dmitry D. [2]
Affiliations
[1] Delft Univ Technol, Delft, Netherlands
[2] ITMO Univ, St Petersburg, Russia
Source
OPTICAL DESIGN AND TESTING XI | 2021 / Vol. 11895
Keywords
GPU; SIMD; control flow graph; loop optimization; DPLL; resnet;
DOI
10.1117/12.2601619
Chinese Library Classification number
O43 [Optics];
Discipline classification code
070207 ; 0803 ;
Abstract
We propose a generalized method for adapting and optimizing algorithms for efficient execution on modern graphics processing units (GPUs). The method consists of several steps. First, build a control flow graph (CFG) of the algorithm. Next, transform the CFG into a tree of loops and merge non-parallelizable loops into parallelizable ones. Finally, map the resulting loop tree to the tree of GPU computational units, unrolling the algorithm's loops as necessary to make the two trees match. The method provides a convenient and robust mental framework and strategy for GPU code optimization. We demonstrate the method by adapting a backtracking search algorithm to the GPU platform and building an optimized implementation of the ResNeXt-50 neural network.
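The final step of the abstract, mapping a loop tree onto the tree of GPU computational units, can be illustrated with a small sketch. This is not the authors' implementation: the `Loop` class, the `map_loops` function, the three-level hierarchy in `GPU_LEVELS`, and the example loop nest with its trip counts are all hypothetical, chosen only to show the idea. The sketch also leaves out the CFG construction, loop merging, and unrolling steps.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumed GPU hierarchy, coarse to fine. A loop mapped to a level has its
# iterations distributed over the units of that level.
GPU_LEVELS = ["block", "warp", "thread"]


@dataclass
class Loop:
    """One node of a hypothetical loop tree."""
    name: str
    trip_count: int
    parallel: bool  # True if iterations are independent of each other
    children: List["Loop"] = field(default_factory=list)


def map_loops(loop: Loop, level: int = 0,
              mapping: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Greedily assign each loop to the next free GPU level.

    Parallel loops consume one level of the hierarchy; non-parallel loops,
    and parallel loops left over once the hierarchy is exhausted, run
    sequentially inside whatever unit encloses them.
    """
    if mapping is None:
        mapping = {}
    if loop.parallel and level < len(GPU_LEVELS):
        mapping[loop.name] = GPU_LEVELS[level]
        next_level = level + 1
    else:
        mapping[loop.name] = "sequential"
        next_level = level
    for child in loop.children:
        map_loops(child, next_level, mapping)
    return mapping


# Hypothetical loop nest, loosely shaped like a convolution layer.
nest = Loop("images", 64, True, [
    Loop("rows", 56, True, [
        Loop("cols", 56, True, [
            Loop("channel_sum", 256, False),  # reduction: kept sequential
        ]),
    ]),
])

mapping = map_loops(nest)
for name, unit in mapping.items():
    print(f"{name:12s} -> {unit}")
```

In this sketch the three parallel loops are assigned to blocks, warps, and threads in order, while the reduction over channels stays sequential inside each thread. A fuller treatment would also compare trip counts against the hardware limits of each level, which is where the unrolling mentioned in the abstract would come in.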
Pages: 8
Related papers
50 in total
  • [41] Enhancing an Embedded Processor Core for Efficient and Isolated Execution of Cryptographic Algorithms
    Yumbul, Kazim
    Savas, Erkay
    COMPUTER JOURNAL, 2015, 58 (10): : 2368 - 2387
  • [42] New heuristic algorithms for efficient execution of multiple groups of parallel processes
    Maksoud, E.Y. Abdel
    Journal of Engineering and Applied Science, 2000, 47 (06): : 981 - 1000
  • [43] Building a Lightweight Trusted Execution Environment for Arm GPUs
    Wang, Chenxu
    Deng, Yunjie
    Ning, Zhenyu
    Leach, Kevin
    Li, Jin
    Yan, Shoumeng
    He, Zhengyu
    Cao, Jiannong
    Zhang, Fengwei
    IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2024, 21 (04) : 3801 - 3816
  • [44] Scalar Waving: Improving the Efficiency of SIMD Execution on GPUs
    Yilmazer, Ayse
    Chen, Zhongliang
    Kaeli, David
    2014 IEEE 28TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM, 2014,
  • [45] Warp-Consolidation: A Novel Execution Model for GPUs
    Li, Ang
    Liu, Weifeng
    Wang, Linnan
    Barker, Kevin
    Song, Shuaiwen Leon
    INTERNATIONAL CONFERENCE ON SUPERCOMPUTING (ICS 2018), 2018, : 53 - 64
  • [46] G-Scalar: Cost-Effective Generalized Scalar Execution Architecture for Power-Efficient GPUs
    Liu, Zhenhong
    Gilani, Syed
    Annavaram, Murali
    Kim, Nam Sung
    2017 23RD IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), 2017, : 601 - 612
  • [47] Optimization of designing multiple genes encoding the same protein based on NSGA-II for efficient execution on GPUs
    Kim, Donghyeon
    Kim, Jinsung
    ELECTRONIC RESEARCH ARCHIVE, 2023, 31 (09): : 5313 - 5339
  • [48] Using GPUs to Accelerate CAD Algorithms
    Croix, John F.
    Gulati, Kanupriya
    Khatri, Sunil P.
    IEEE DESIGN & TEST, 2013, 30 (01) : 8 - 16
  • [49] Parallel Vertex Cover Algorithms on GPUs
    Yamout, Peter
    Barada, Karim
    Jaljuli, Adnan
    Mouawad, Amer E.
    El Hajj, Izzat
    2022 IEEE 36TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2022), 2022, : 201 - 211
  • [50] Comparison of Modular Arithmetic Algorithms on GPUs
    Giorgi, Pascal
    Izard, Thomas
    Tisserand, Arnaud
    PARALLEL COMPUTING: FROM MULTICORES AND GPU'S TO PETASCALE, 2010, 19 : 315 - 322