An efficient parallel algorithm for O(N^2) direct summation method and its variations on distributed-memory parallel machines

Cited by: 27
Authors
Makino, J [1]
Affiliation
[1] Univ Tokyo, Sch Sci, Dept Astron, Bunkyo Ku, Tokyo 1130033, Japan
Source
NEW ASTRONOMY | 2002 / Vol. 7 / Issue 07
Funding
Japan Society for the Promotion of Science;
Keywords
celestial mechanics; stellar dynamics; methods: numerical;
DOI
10.1016/S1384-1076(02)00143-4
Chinese Library Classification
P1 [Astronomy];
Subject classification code
0704;
Abstract
We present a novel, highly efficient algorithm to parallelize the O(N^2) direct summation method for N-body problems with individual timesteps on distributed-memory parallel machines such as Beowulf clusters. Previously known algorithms, in which all processors have complete copies of the N-body system, have the serious problem that the communication-computation ratio increases as we increase the number of processors, since the communication cost is independent of the number of processors. In the new algorithm, p processors are organized as a √p × √p two-dimensional array. Each processor has N/√p particles, but the data are distributed in such a way that the complete system is present if we look at any row or column consisting of √p processors. In this algorithm, the communication cost scales as N/√p, while the calculation cost scales as N^2/p. Thus, we can use a much larger number of processors without losing efficiency compared to what was practical with previously known algorithms. (C) 2002 Elsevier Science B.V. All rights reserved.
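The two-dimensional layout described in the abstract can be illustrated with a small serial simulation. The following is a minimal sketch, not the paper's implementation: it assumes that processor (i, j) of the √p × √p array holds block i of the particles as force targets and block j as force sources, it uses a shared timestep rather than individual timesteps, and it replaces all message passing with ordinary loops. The names `partial_acc`, `grid_acc`, and `cost_ratios`, the softening `eps`, and the unit system (G = 1) are hypothetical choices made for illustration.

```python
import math
import random


def partial_acc(targets, sources, eps=1e-2):
    """Softened accelerations (G = 1) on `targets` due to `sources`.

    Each particle is a tuple (mass, x, y, z). Self-interaction contributes
    zero because the separation vector vanishes.
    """
    out = []
    for _, xi, yi, zi in targets:
        ax = ay = az = 0.0
        for mj, xj, yj, zj in sources:
            dx, dy, dz = xj - xi, yj - yi, zj - zi
            inv_r3 = (dx * dx + dy * dy + dz * dz + eps * eps) ** -1.5
            ax += mj * dx * inv_r3
            ay += mj * dy * inv_r3
            az += mj * dz * inv_r3
        out.append((ax, ay, az))
    return out


def grid_acc(particles, r, eps=1e-2):
    """Serially simulate an r x r processor grid (p = r * r).

    Assumed layout: processor (i, j) holds target block i and source block j,
    so every row and every column together covers the whole system.
    """
    n = len(particles)
    if n % r != 0:
        raise ValueError("this sketch assumes N is divisible by sqrt(p)")
    size = n // r  # N / sqrt(p) particles per block
    blocks = [particles[k * size:(k + 1) * size] for k in range(r)]
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(r):          # grid row: which block of targets
        for j in range(r):      # grid column: which block of sources
            part = partial_acc(blocks[i], blocks[j], eps)
            # Stands in for summing partial forces along row i.
            for k, a in enumerate(part):
                for c in range(3):
                    acc[i * size + k][c] += a[c]
    return acc


def cost_ratios(n, p):
    """Communication/calculation ratios from the scalings in the abstract.

    Returns (copy_ratio, grid_ratio): communication ~ N for the full-copy
    scheme and ~ N / sqrt(p) for the grid scheme, both divided by the
    calculation cost ~ N^2 / p. Constant factors are ignored.
    """
    calc = n * n / p
    return n / calc, (n / math.sqrt(p)) / calc


if __name__ == "__main__":
    random.seed(1)
    system = [(random.random(), random.gauss(0, 1), random.gauss(0, 1),
               random.gauss(0, 1)) for _ in range(16)]
    reference = partial_acc(system, system)   # plain O(N^2) summation
    blocked = grid_acc(system, 4)             # 4 x 4 grid, p = 16
    worst = max(abs(b[c] - ref[c])
                for b, ref in zip(blocked, reference) for c in range(3))
    print("largest difference from direct summation:", worst)
    print("ratios (full copy, grid) for N=1000, p=16:", cost_ratios(1000, 16))
```

Under these assumptions each simulated processor evaluates (N/√p)^2 = N^2/p pairwise interactions, and the ratio of communication to calculation grows as p/N for the full-copy scheme but only as √p/N for the grid, which is the scaling advantage the abstract claims.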
Pages: 373-384
Page count: 12