Scalable parallel linear solver for compact banded systems on heterogeneous architectures

Cited by: 1

Authors:

Song, Hang [1]

Matsuno, Kristen V. [1]

West, Jacob R. [1]

Subramaniam, Akshay [2]

Ghate, Aditya S. [2]

Lele, Sanjiva K. [1,2]

Affiliations:

[1] Stanford Univ, Dept Mech Engn, Stanford, CA 94305 USA

[2] Stanford Univ, Dept Aeronaut & Astronaut, Stanford, CA 94305 USA

Source:

JOURNAL OF COMPUTATIONAL PHYSICS | 2022 / Vol. 468

Funding:

National Science Foundation (USA)

Keywords:

Compact banded system; Periodic boundary; Parallel cyclic reduction; Distributed memory; Parallel computing; Block tridiagonal systems; Large-eddy simulation; Cyclic reduction; Difference schemes; Flow

DOI:

10.1016/j.jcp.2022.111443

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification code:

081203; 0835

Abstract:

A scalable algorithm for solving compact banded linear systems on distributed memory architectures is presented. The proposed method factorizes the original system into two levels of memory hierarchies, and solves it using parallel cyclic reduction on both distributed and shared memory. This method has a lower communication footprint across distributed memory partitions than conventional algorithms that involve data transposes or re-partitioning. The algorithm developed in this work is generalized to cyclic compact banded systems with flexible data decompositions. For cyclic compact banded systems, the method is a direct solver with deterministic operation and communication counts that depend on the matrix size, its bandwidth, and the partition strategy. The implementation and runtime configuration details are discussed for performance optimization. Scalability is demonstrated on the linear solver as well as on a representative fluid mechanics application problem, in which the dominant computational cost is solving the cyclic tridiagonal linear systems of compact numerical schemes on a 3D periodic domain. The algorithm is particularly useful for solving the linear systems arising from the application of compact finite difference operators to a wide range of partial differential equation problems, including, but not limited to, the numerical simulations of compressible turbulent flows, aeroacoustics, elastic-plastic wave propagation, and electromagnetics. It alleviates obstacles to their use on modern high performance computing hardware, where memory and computational power are distributed across nodes with multi-threaded processing units. © 2022 Elsevier Inc. All rights reserved.
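For readers unfamiliar with the building block named in the abstract, the following is a minimal single-process sketch of textbook parallel cyclic reduction (PCR) applied to a cyclic (periodic) tridiagonal system. It is not the two-level distributed algorithm of the paper. The function names `pcr_cyclic_tridiagonal` and `residual` are hypothetical, and the sketch assumes the system size is a power of two and the matrix is diagonally dominant so that no pivot vanishes.

```python
def pcr_cyclic_tridiagonal(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i], indices taken mod n.

    Textbook PCR sketch (assumption: n is a power of two, system is
    diagonally dominant). Not the paper's distributed implementation.
    """
    n = len(b)
    if n < 1 or n & (n - 1):
        raise ValueError("this sketch assumes n is a power of two")
    a, b, c, d = list(a), list(b), list(c), list(d)
    s = 1  # current coupling stride: row i couples x[i-s], x[i], x[i+s]
    while s < n:
        na, nb, nc, nd = [0.0] * n, [0.0] * n, [0.0] * n, [0.0] * n
        # Every row is updated independently of the others, which is
        # what makes the reduction step parallel.
        for i in range(n):
            lo, hi = (i - s) % n, (i + s) % n
            alpha = -a[i] / b[lo]  # eliminates x[i-s] using row i-s
            gamma = -c[i] / b[hi]  # eliminates x[i+s] using row i+s
            na[i] = alpha * a[lo]
            nb[i] = b[i] + alpha * c[lo] + gamma * a[hi]
            nc[i] = gamma * c[hi]
            nd[i] = d[i] + alpha * d[lo] + gamma * d[hi]
        a, b, c, d = na, nb, nc, nd
        s *= 2
    # Once the stride reaches n, x[i-n], x[i], and x[i+n] are the same unknown.
    return [d[i] / (a[i] + b[i] + c[i]) for i in range(n)]


def residual(a, b, c, d, x):
    """Largest absolute residual of the cyclic tridiagonal system."""
    n = len(x)
    return max(
        abs(a[i] * x[(i - 1) % n] + b[i] * x[i] + c[i] * x[(i + 1) % n] - d[i])
        for i in range(n)
    )


if __name__ == "__main__":
    n = 8
    # Illustrative coefficients of the kind that arise from a compact scheme
    # on a periodic grid (off-diagonals 1/3, unit diagonal).
    a, b, c = [1.0 / 3.0] * n, [1.0] * n, [1.0 / 3.0] * n
    d = [float(i * i - 3) for i in range(n)]
    x = pcr_cyclic_tridiagonal(a, b, c, d)
    print("max residual:", residual(a, b, c, d, x))
```

The sketch takes log2(n) reduction steps, each a fixed number of operations per row, which illustrates why the abstract can describe the operation count of a direct PCR-based solver as deterministic.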


Pages: 16

50 items in total

[21] Scalable parallel algorithms for sparse linear systems
Gupta, A
Karypis, G
Kumar, V
PARALLEL COMPUTING IN OPTIMIZATION, 1997, 7 : 73 - 98
[22] Scalable Parallel Numerical CSP Solver
Ishii, Daisuke
Yoshizoe, Kazuki
Suzumura, Toyotaro
PRINCIPLES AND PRACTICE OF CONSTRAINT PROGRAMMING, CP 2014, 2014, 8656 : 398 - 406
[23] The design, implementation, and evaluation of a symmetric banded linear solver for distributed-memory parallel computers
Gupta, A
Gustavson, FG
Joshi, M
Toledo, S
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 1998, 24 (01) : 74 - 101
[24] Parallel Solution of Narrow Banded Diagonally Dominant Linear Systems
Mikkelsen, Carl Christian Kjelgaard
Kagstrom, Bo
APPLIED PARALLEL AND SCIENTIFIC COMPUTING, PT II, 2012, 7134 : 280 - 290
[25] Parallel direct methods for solving banded linear systems
Saad, Y
Schultz, MH
LINEAR ALGEBRA AND ITS APPLICATIONS, 1987, 88-9 : 623 - 650
[26] A parallel elimination method for the solution of banded linear systems
Chawla, MM
Passi, K
INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS, 1994, 50 (3-4) : 197 - 201
[27] Tridiagonal splittings in the conditioning and parallel solution of banded linear systems
Lopez, L
Politi, T
LINEAR ALGEBRA AND ITS APPLICATIONS, 1997, 251 : 249 - 265
[28] The computation and communication complexity of a parallel banded system solver
Lawrie, DH
Sameh, AH
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 1984, 10 (02) : 185 - 195
[29] A parallel hybrid banded system solver: the SPIKE algorithm
Polizzi, E
Sameh, AH
PARALLEL COMPUTING, 2006, 32 (02) : 177 - 194
[30] Task-Based Sparse Hybrid Linear Solver for Distributed Memory Heterogeneous Architectures
Agullo, Emmanuel
Giraud, Luc
Nakov, Stojce
EURO-PAR 2016: PARALLEL PROCESSING WORKSHOPS, 2017, 10104 : 83 - 95
