An efficient parallel algorithm for O(N^2) direct summation method and its variations on distributed-memory parallel machines

Times cited: 27

Author:

Makino, J [1]

Affiliation:

[1] Univ Tokyo, Sch Sci, Dept Astron, Bunkyo Ku, Tokyo 1130033, Japan

Source:

NEW ASTRONOMY | 2002 / Volume 7 / Issue 7

Funding:

Japan Society for the Promotion of Science

Keywords:

celestial mechanics; stellar dynamics; methods: numerical

DOI:

10.1016/S1384-1076(02)00143-4

Chinese Library Classification:

P1 [Astronomy]

Discipline classification code:

0704

Abstract:

We present a novel, highly efficient algorithm to parallelize the O(N^2) direct summation method for N-body problems with individual timesteps on distributed-memory parallel machines such as Beowulf clusters. Previously known algorithms, in which all processors have complete copies of the N-body system, have the serious problem that the communication-computation ratio increases as we increase the number of processors, since the communication cost is independent of the number of processors. In the new algorithm, p processors are organized as a √p × √p two-dimensional array. Each processor has N/√p particles, but the data are distributed in such a way that the complete system is present if we look at any row or column consisting of √p processors. In this algorithm, the communication cost scales as N/√p, while the calculation cost scales as N^2/p. Thus, we can use a much larger number of processors without losing efficiency compared to what was practical with previously known algorithms. (C) 2002 Elsevier Science B.V. All rights reserved.
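The two-dimensional layout described in the abstract can be illustrated with a small serial emulation. This is only a sketch under stated assumptions, not the paper's implementation: it assumes that the processor at grid position (row, col) computes the forces exerted by source block `col` on target block `row`, and that partial results are then summed along each processor row. It also uses shared timesteps rather than the individual timesteps the paper addresses. The function names, the softening parameter `eps`, the unit system (G = 1), and the helper `random_system` are all hypothetical choices made for the example.

```python
import math
import random


def add_pair(acc_i, x_i, x_j, m_j, eps):
    """Add the softened gravitational pull of particle j to acc_i (G = 1)."""
    dx = [x_j[k] - x_i[k] for k in range(3)]
    r2 = dx[0] ** 2 + dx[1] ** 2 + dx[2] ** 2 + eps * eps
    w = m_j / (r2 * math.sqrt(r2))
    for k in range(3):
        acc_i[k] += w * dx[k]


def direct_forces(pos, mass, eps=1e-3):
    """Reference O(N^2) direct summation on a single processor."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                add_pair(acc[i], pos[i], pos[j], mass[j], eps)
    return acc


def grid_forces(pos, mass, r, eps=1e-3):
    """Serial emulation of an r x r processor grid (p = r * r).

    Assumed layout (not taken from the paper): processor (row, col)
    works on target block `row` and source block `col`, each holding
    N / r particles.
    """
    n = len(pos)
    if n % r != 0:
        raise ValueError("this sketch assumes N is divisible by sqrt(p)")
    size = n // r
    blocks = [range(k * size, (k + 1) * size) for k in range(r)]
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for row in range(r):
        for col in range(r):
            # Work of processor (row, col): about (N / r)^2 = N^2 / p pairs.
            partial = {i: [0.0, 0.0, 0.0] for i in blocks[row]}
            for i in blocks[row]:
                for j in blocks[col]:
                    if i != j:
                        add_pair(partial[i], pos[i], pos[j], mass[j], eps)
            # Stand-in for the reduction along the processor row: each
            # processor contributes N / r partial accelerations.
            for i in blocks[row]:
                for k in range(3):
                    acc[i][k] += partial[i][k]
    return acc


def random_system(n, seed=0):
    """Hypothetical test data: n equal-mass particles in the unit cube."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(3)] for _ in range(n)]
    mass = [1.0 / n] * n
    return pos, mass
```

Calling `grid_forces(pos, mass, 3)` on a 12-particle system emulates p = 9 processors and should agree with `direct_forces(pos, mass)` up to floating-point rounding. Using the scalings quoted in the abstract, the communication-computation ratio of the grid layout goes as (N/√p) / (N^2/p) = √p/N, whereas for the replicated-copy algorithms it goes as N / (N^2/p) = p/N, which is why the grid layout tolerates far more processors at a given N.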

Pages: 373-384

Page count: 12
