On the effective implementation of a boundary element code on graphics processing units using an out-of-core LU algorithm

Cited by: 5
Authors
D'Azevedo, E. F. [1]
Fata, S. Nintcheu [1]
Affiliations
[1] Oak Ridge Natl Lab, Comp Sci & Math Div, Oak Ridge, TN 37831 USA
Keywords
Collocation approximation; Boundary element method; Triangulated boundary; Graphics processor; Factorization
DOI
10.1016/j.enganabound.2012.02.014
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
A collocation boundary element code for solving the three-dimensional Laplace equation, publicly available from http://intetec.org, has been adapted to run on an Nvidia Tesla general-purpose graphics processing unit (GPU). Global matrix assembly and LU factorization of the resulting dense matrix are performed on the GPU. Out-of-core techniques are used to solve problems larger than the available GPU memory. The code achieved about 10 times speedup in matrix assembly over a single CPU core and about 56 Gflop/s in the LU factorization using only 512 Mbytes of GPU memory. Details of the GPU implementation and comparisons with the standard sequential algorithm are included to illustrate the performance of the GPU code. (C) 2012 Elsevier Ltd. All rights reserved.
Pages: 1246-1255
Number of pages: 10