A detailed performance analysis of the interpolation supplemented lattice Boltzmann method on the Cray T3E and Cray X1

Cited by: 7
Authors
Sunder, C. Shyam [1]
Baskar, G.
Babu, V.
Strenski, David
Affiliations
[1] Indian Inst Technol, Dept Mech Engn, TDCE, Madras 600036, Tamil Nadu, India
[2] Cornell Univ, Sibley Sch Mech & Aerosp Engn, Mat Proc Design & Control Lab, Ithaca, NY 14853 USA
[3] ETH, Inst Energietech, CH-8092 Zurich, Switzerland
[4] Cray Inc, Seattle, WA 98104 USA
Keywords
shared memory; multiprocessors; parallel computing; SHMEM; MPI
DOI
10.1177/1094342006064572
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
A detailed study of the parallel performance of the interpolation supplemented lattice Boltzmann (ISLB) method using SHMEM and MPI on the Cray T3E-900 and Cray X1 architectures is presented. The noteworthy feature of the present implementation of the ISLB method is that it is able to achieve a sustained speed of 4.2 Tflop/s while using 504 processors on a Cray X1. The code is shown to achieve super-linear speedups on the Cray T3E-900. It is shown through detailed profiling that the computation and the communication scale well on the Cray X1, although the overall speedup is adversely affected by the cost of barrier synchronization.
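As a rough illustration of the figures quoted in the abstract, the following sketch works out the sustained rate per processor implied by 4.2 Tflop/s on 504 processors, and applies the usual definitions of speedup and parallel efficiency, under which a run counts as super-linear when its efficiency exceeds 1. The function names and the timing values in the usage lines are hypothetical and are not taken from the paper.

```python
def per_processor_rate(total_flops: float, processors: int) -> float:
    """Average sustained floating-point rate per processor, in flop/s."""
    return total_flops / processors


def speedup(t_ref: float, t_p: float) -> float:
    """Speedup of a run taking t_p seconds relative to a reference run taking t_ref seconds."""
    return t_ref / t_p


def efficiency(t_ref: float, p_ref: int, t_p: float, p: int) -> float:
    """Parallel efficiency: measured speedup divided by the processor-count ratio p / p_ref."""
    return speedup(t_ref, t_p) / (p / p_ref)


def is_superlinear(t_ref: float, p_ref: int, t_p: float, p: int) -> bool:
    """True when the speedup exceeds the increase in processor count."""
    return efficiency(t_ref, p_ref, t_p, p) > 1.0


# Figures from the abstract: 4.2 Tflop/s sustained on 504 processors.
rate = per_processor_rate(4.2e12, 504)
print(f"{rate / 1e9:.2f} Gflop/s per processor")

# Hypothetical timings (NOT from the paper): 100 s on 16 processors, 11 s on 128.
print(is_superlinear(t_ref=100.0, p_ref=16, t_p=11.0, p=128))
```

Super-linear speedup of this kind is commonly attributed to the per-processor working set shrinking until it fits in cache; the abstract itself does not state the cause.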
Pages: 557-570
Page count: 14
Related papers
50 in total
  • [1] Performance analysis on CRAY T3E
    Gerndt, M
    Mohr, B
    Pantano, M
    Wolf, F
    PROCEEDINGS OF THE SEVENTH EUROMICRO WORKSHOP ON PARALLEL AND DISTRIBUTED PROCESSING, PDP'99, 1999, : 241 - 248
  • [2] Performance of parallel Gaussian 94 on the Cray T3E
    Sosa, CP
    Ochterski, J
    Carpenter, J
    Frisch, MJ
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 1997, 213 : 31 - COMP
  • [3] Parallelising the unified model for the Cray T3E
    Burton, P
    Dickinson, A
    MAKING ITS MARK, 1997, : 68 - 82
  • [4] PScheD - Political Scheduling on the CRAY T3E
    Lagerstrom, RN
    Gipp, SK
    JOB SCHEDULING STRATEGIES FOR PARALLEL PROCESSING, 1997, 1291 : 117 - 138
  • [5] Parallel pivots LU algorithm on the Cray T3E
    Asenjo, R
    Zapata, EL
    PARALLEL COMPUTATION, 1999, 1557 : 38 - 47
  • [6] Running a code for lattice quantum chromodynamics efficiently on CRAY T3E systems
    Attig, N
    Güsken, S
    Lacock, P
    Lippert, T
    Schilling, K
    Ueberholz, P
    Viehoff, J
    HIGH-PERFORMANCE COMPUTING AND NETWORKING, 1998, 1401 : 183 - 192
  • [7] Giant eigenproblems from lattice gauge theory on CRAY T3E systems
    Attig, N
    Lippert, T
    Neff, H
    Negele, J
    Schilling, K
    COMPUTER PHYSICS COMMUNICATIONS, 2001, 142 (1-3) : 196 - 200
  • [8] Fine-grained multithreading on the Cray T3E
    Grävinghoff, A
    Keller, J
    HIGH PERFORMANCE COMPUTING IN SCIENCE AND ENGINEERING '99, 2000, : 447 - 456
  • [9] Impact of PE mapping on Cray T3E message-passing performance
    Huedo, E
    Prieto, M
    Llorente, IM
    Tirado, F
    EURO-PAR 2000 PARALLEL PROCESSING, PROCEEDINGS, 2000, 1900 : 199 - 207
  • [10] Performance and scalability analysis of Cray X1 vectorization and multistreaming optimization
    Alam, S
    Vetter, J
    COMPUTATIONAL SCIENCE - ICCS 2005, PT 1, PROCEEDINGS, 2005, 3514 : 304 - 312