GPU parallel implementation of a finite volume lattice Boltzmann method for incompressible flows

Cited by: 2
Authors
Wen, Mengke [1, 2]
Shen, Siyuan [3]
Li, Weidong [1, 2]
Affiliations
[1] China Aerodynam Res & Dev Ctr, Hyperveloc Aerodynam Inst, Mianyang 621000, Peoples R China
[2] Natl Key Lab Aerosp Phys Fluids, Mianyang 621000, Peoples R China
[3] Wuhan Univ Technol, Sch Automat, Wuhan 430070, Peoples R China
Keywords
GPU parallel; Finite volume lattice Boltzmann method; Unstructured mesh; Incompressible flows; CIRCULAR-CYLINDER; SIMULATION; MODEL
DOI
10.1016/j.compfluid.2024.106460
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
This work presents a graphics processing unit (GPU) parallel algorithm for a cell-centered finite volume lattice Boltzmann method (FVLBM) on unstructured meshes. The parallelization is performed in physical space. To reduce the frequency of GPU memory accesses, the algorithm uses coalesced access to GPU memory. In addition, to avoid both resource races that lead to data anomalies (such as dirty reads or phantom reads) and double counting in the flux calculation, the efficient face-based data structure that the central processing unit (CPU) version of FVLBM uses to accumulate fluxes into cells is replaced by a face-based loop that computes the fluxes on all faces, followed by a cell-based loop that assembles the final residuals in all cells. The proposed algorithm therefore needs no resource locks and retains the high efficiency of the face-based data structure in the flux computation, which improves its parallel efficiency. To demonstrate the computational efficiency of the algorithm, several benchmarks are run in double precision on an NVIDIA GeForce RTX 3090 Ti GPU: (a) lid-driven flow in a two-dimensional (2D) square cavity, (b) 2D flow past a cylinder, and (c) lid-driven flow in a three-dimensional (3D) cubic cavity. The numerical results show that the proposed GPU parallel algorithm can be as accurate as the original serial CPU scheme while achieving a speedup of one to two orders of magnitude.
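The two-pass residual assembly described in the abstract can be illustrated with a minimal serial sketch. It is not the authors' code: the function names, the one-dimensional four-cell mesh, the zero-flux boundary, and the placeholder flux formula are all assumptions made for illustration. The sketch only shows why a face pass followed by a cell pass needs no lock, since every task writes exactly one output slot.

```python
# Minimal sketch of a two-pass (face pass, then cell pass) residual assembly.
# All names and the flux formula are hypothetical placeholders, not the
# paper's FVLBM implementation.

def build_cell_faces(n_cells, face_cells):
    """Return, for each cell, a list of (face, sign) pairs.

    sign is +1 when the cell is the face's owner and -1 when it is the
    neighbour, so a flux leaving the owner enters the neighbour.
    """
    cell_faces = [[] for _ in range(n_cells)]
    for f, (owner, neigh) in enumerate(face_cells):
        cell_faces[owner].append((f, +1))
        if neigh >= 0:  # neigh == -1 marks a boundary face
            cell_faces[neigh].append((f, -1))
    return cell_faces


def face_flux(f, face_cells, phi):
    """Placeholder flux through face f, positive from owner to neighbour."""
    owner, neigh = face_cells[f]
    if neigh < 0:
        return 0.0  # assumed zero-flux boundary
    return phi[owner] - phi[neigh]


def residual_scatter(n_cells, face_cells, phi):
    """Serial face loop in which every face updates two cells.

    With one thread per face, two faces of the same cell could write
    res[c] at the same time; this is the kind of race the abstract avoids.
    """
    res = [0.0] * n_cells
    for f, (owner, neigh) in enumerate(face_cells):
        flux = face_flux(f, face_cells, phi)
        res[owner] -= flux
        if neigh >= 0:
            res[neigh] += flux
    return res


def residual_two_pass(n_cells, face_cells, cell_faces, phi):
    """Face pass then cell pass; each task writes exactly one output slot."""
    # Pass 1: one independent task per face, writing only flux[f].
    flux = [face_flux(f, face_cells, phi) for f in range(len(face_cells))]
    # Pass 2: one independent task per cell, reading flux, writing only res[c].
    res = [-sum(sign * flux[f] for f, sign in cell_faces[c])
           for c in range(n_cells)]
    return res


# Four cells in a row; faces 0 and 4 are boundary faces.
face_cells = [(0, -1), (0, 1), (1, 2), (2, 3), (3, -1)]
phi = [1.0, 4.0, 2.0, 7.0]
cell_faces = build_cell_faces(4, face_cells)
print(residual_two_pass(4, face_cells, cell_faces, phi))
print(residual_scatter(4, face_cells, phi))
```

Both functions return the same residuals on this mesh. In a GPU version, each of the two passes would plausibly map to one kernel with one thread per face or per cell, but that mapping is an assumption here, not a detail taken from the paper.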
Pages: 12