Fully parallel and pipelined sparse direct solver for large symmetric indefinite finite element problems

Authors
Wang, Yujie [1]
Wang, Shengquan [1]
Cai, Yong [1]
Wang, Guidong [1]
Li, Guangyao [2]
Affiliations
[1] Hunan Univ, State Key Lab Adv Design & Mfg Technol Vehicle, Changsha 410082, Peoples R China
[2] Beijing Inst Technol, Shenzhen Automot Res Inst, Shenzhen 518118, Guangdong, Peoples R China
Keywords
Sparse direct solver; High performance computing; FEM; Cholesky factorization; Algorithm
DOI
10.1016/j.camwa.2024.10.017
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
Solving sparse linear systems is a primary computational cost in large-scale finite element analysis, and improving its performance is a key technological challenge in this field. Real-world engineering problems involve diverse materials, elements, and connectivity relationships, making it difficult for iterative methods to handle their global stiffness matrices. Direct methods, owing to their robustness, emerge as the preferred choice. In this paper, a novel block-based supernodal LDL^T numerical factorization method is introduced. The computation is decomposed into distinct tasks, and the dependencies between these tasks are expressed as a directed acyclic graph that guides the calculation sequence. Based on this approach, a global task pool and local task stacks are established to store task queues, enhancing data reuse and multicore collaboration efficiency. Additionally, an effective task dispatch and work-stealing mechanism is implemented to prevent performance degradation caused by load imbalances. Numerical experiments, including a publicly available matrix test set and real-world engineering finite element problems, are conducted to compare the parallel performance of PARDISO, MUMPS, and the proposed solver. The results illustrate that the proposed solver performs significantly better than the other solvers on various types of sparse matrices and across diverse multicore processor architectures.
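The scheduling idea summarized in the abstract (tasks ordered by a dependency graph, a shared pool, per-worker stacks, and stealing when a worker runs dry) can be illustrated with a small sketch. This is not the authors' implementation: it is a single-threaded, round-robin simulation, and the function name `run_dag`, the task names, and the stealing policy (take the oldest task from the fullest stack) are all assumptions made for illustration.

```python
from collections import deque

def run_dag(deps, n_workers=2):
    """Simulate DAG-ordered execution with a global pool, local stacks, and stealing.

    deps maps each task name to the set of task names it depends on.
    Returns the list of (worker, task) pairs in execution order.
    """
    remaining = {t: len(d) for t, d in deps.items()}  # unmet dependency counts
    children = {t: [] for t in deps}
    for t, d in deps.items():
        for p in d:
            children[p].append(t)

    # Tasks with no dependencies start in the shared (global) pool.
    pool = deque(sorted(t for t, n in remaining.items() if n == 0))
    stacks = [[] for _ in range(n_workers)]           # one local stack per worker
    order = []

    while len(order) < len(deps):
        progressed = False
        for w in range(n_workers):
            if stacks[w]:
                task = stacks[w].pop()                # newest local task first
            elif pool:
                task = pool.popleft()                 # otherwise draw from the pool
            else:
                victim = max(range(n_workers), key=lambda v: len(stacks[v]))
                if not stacks[victim]:
                    continue                          # nothing to steal this round
                task = stacks[victim].pop(0)          # steal the victim's oldest task
            order.append((w, task))
            progressed = True
            for c in sorted(children[task]):
                remaining[c] -= 1
                if remaining[c] == 0:
                    stacks[w].append(c)               # newly ready tasks stay local
        if not progressed:
            raise ValueError("dependency cycle detected")
    return order

# Hypothetical task graph: two independent block factorizations (F1, F2),
# two updates that depend on them (U13, U23), and a final factorization (F3).
example_deps = {
    "F1": set(),
    "F2": set(),
    "U13": {"F1"},
    "U23": {"F2"},
    "F3": {"U13", "U23"},
}
schedule = run_dag(example_deps, n_workers=2)
```

Keeping newly ready tasks on the stack of the worker that released them is one plausible way to favor data reuse, since the follow-up task often touches the block that was just processed. A real multithreaded solver would additionally need synchronization around the pool and the stacks, which this sketch omits.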
Pages: 447-469
Page count: 23