An efficient parallel algorithm for O(N^2) direct summation method and its variations on distributed-memory parallel machines

Cited by: 27

Authors:

Makino, J [1]

Affiliations:

[1] Univ Tokyo, Sch Sci, Dept Astron, Bunkyo Ku, Tokyo 1130033, Japan

Source:

NEW ASTRONOMY | 2002 / Vol. 7 / Issue 07

Funding:

Japan Society for the Promotion of Science;

Keywords:

celestial mechanics; stellar dynamics; methods: numerical;

DOI:

10.1016/S1384-1076(02)00143-4

Chinese Library Classification:

P1 [Astronomy];

Subject classification code:

0704;

Abstract:

We present a novel, highly efficient algorithm to parallelize the O(N^2) direct summation method for N-body problems with individual timesteps on distributed-memory parallel machines such as Beowulf clusters. Previously known algorithms, in which all processors have complete copies of the N-body system, have the serious problem that the communication-computation ratio increases as we increase the number of processors, since the communication cost is independent of the number of processors. In the new algorithm, p processors are organized as a √p × √p two-dimensional array. Each processor has N/√p particles, but the data are distributed in such a way that the complete system is presented if we look at any row or column consisting of √p processors. In this algorithm, the communication cost scales as N/√p, while the calculation cost scales as N^2/p. Thus, we can use a much larger number of processors without losing efficiency compared to what was practical with previously known algorithms. (C) 2002 Elsevier Science B.V. All rights reserved.
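
The layout described in the abstract can be illustrated with a small serial sketch. It is not the paper's implementation: the block assignment (processor (i, j) pairs target block i with source block j), the function names `pairwise_accel`, `grid_accel` and `comm_to_calc_ratio`, the softening value, and the unit masses are all assumptions made for illustration, and individual timesteps are not modeled. The cost model only restates the scalings quoted in the abstract with constant factors dropped.

```python
import math


def pairwise_accel(targets, sources, eps2=1e-4):
    """Softened gravitational acceleration on each target from all sources.

    Assumes G = 1 and unit masses. A particle paired with itself
    contributes zero because its separation vector is zero.
    """
    out = []
    for tx, ty, tz in targets:
        ax = ay = az = 0.0
        for sx, sy, sz in sources:
            dx, dy, dz = sx - tx, sy - ty, sz - tz
            r2 = dx * dx + dy * dy + dz * dz + eps2
            inv_r3 = r2 ** -1.5
            ax += dx * inv_r3
            ay += dy * inv_r3
            az += dz * inv_r3
        out.append((ax, ay, az))
    return out


def grid_accel(pos, p):
    """Serially emulate a sqrt(p) x sqrt(p) processor grid.

    Hypothetical assignment: "processor" (i, j) holds target block i and
    source block j, so every grid row sees all source blocks. Partial
    results are then summed along each row.
    """
    r = math.isqrt(p)
    assert r * r == p, "p must be a perfect square"
    assert len(pos) % r == 0, "N must be divisible by sqrt(p)"
    size = len(pos) // r
    blocks = [pos[k * size:(k + 1) * size] for k in range(r)]
    acc = []
    for i in range(r):
        # Each (i, j) pair costs (N/sqrt(p))^2 = N^2/p interactions.
        partial = [pairwise_accel(blocks[i], blocks[j]) for j in range(r)]
        # Row-wise reduction over the sqrt(p) partial results.
        for k in range(size):
            acc.append(tuple(sum(partial[j][k][c] for j in range(r))
                             for c in range(3)))
    return acc


def comm_to_calc_ratio(n, p, layout):
    """Communication/calculation ratio, constant factors omitted.

    "replicated": communication ~ N (independent of p), calculation ~ N^2/p.
    "grid":       communication ~ N/sqrt(p),            calculation ~ N^2/p.
    """
    calc = n * n / p
    comm = n if layout == "replicated" else n / math.sqrt(p)
    return comm / calc
```

Under these assumptions the ratio grows as p/N for the replicated layout but only as √p/N for the grid layout, which is the reason the abstract gives for being able to use many more processors at the same efficiency.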

Pages: 373 - 384

Page count: 12

