A detailed performance analysis of the interpolation supplemented lattice Boltzmann method on the Cray T3E and Cray X1

Cited by: 7

Authors:

Sunder, C. Shyam [1]

Baskar, G.

Babu, V.

Strenski, David

Affiliations:

[1] Indian Inst Technol, Dept Mech Engn, TDCE, Madras 600036, Tamil Nadu, India

[2] Cornell Univ, Sibley Sch Mech & Aerosp Engn, Mat Proc Design & Control Lab, Ithaca, NY 14853 USA

[3] ETH, Inst Energietech, CH-8092 Zurich, Switzerland

[4] Cray Inc, Seattle, WA 98104 USA

Source:

INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS | 2006 / Vol. 20 / No. 4

Keywords:

shared memory; multiprocessors; parallel computing; SHMEM; MPI

DOI:

10.1177/1094342006064572

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

A detailed study of the parallel performance of the interpolation supplemented lattice Boltzmann (ISLB) method using SHMEM and MPI on the Cray T3E-900 and Cray X1 architectures is presented. The noteworthy feature of the present implementation of the ISLB method is that it is able to achieve a sustained speed of 4.2 Tflop/s while using 504 processors on a Cray X1. The code is shown to achieve super-linear speedups on the Cray T3E-900. It is shown through detailed profiling that the computation and the communication scale well on the Cray X1, although the overall speedup is adversely affected by the cost of barrier synchronization.
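The scaling terms used in the abstract can be made concrete with a short sketch. It is not taken from the paper: the function names and the timing values are hypothetical, chosen only to illustrate what "super-linear speedup" means (parallel efficiency above 1 relative to a reference run). The only figures drawn from the abstract are the 4.2 Tflop/s sustained rate and the 504 processors.

```python
def speedup(t_ref, t_p, p_ref=1):
    """Speedup of a run taking t_p seconds, relative to a reference run
    that took t_ref seconds on p_ref processors."""
    return p_ref * t_ref / t_p


def efficiency(t_ref, t_p, p, p_ref=1):
    """Parallel efficiency: achieved speedup divided by the processor count p."""
    return speedup(t_ref, t_p, p_ref) / p


def is_superlinear(t_ref, t_p, p, p_ref=1):
    """A speedup is super-linear when efficiency exceeds 1."""
    return efficiency(t_ref, t_p, p, p_ref) > 1.0


# Hypothetical timings (NOT from the paper): 100 s on 16 processors,
# 22 s on 64 processors.
t16, t64 = 100.0, 22.0
s64 = speedup(t16, t64, p_ref=16)          # speedup expressed in processor units
e64 = efficiency(t16, t64, 64, p_ref=16)   # greater than 1 means super-linear

# Figures quoted in the abstract: 4.2 Tflop/s sustained on 504 processors.
per_processor_gflops = 4.2e3 / 504         # sustained Gflop/s per processor

print(f"speedup on 64 PEs: {s64:.1f}, efficiency: {e64:.2f}")
print(f"sustained rate per processor: {per_processor_gflops:.2f} Gflop/s")
```

With these made-up timings the speedup on 64 processors exceeds 64, which is the situation the abstract describes as super-linear. The per-processor rate is plain division of the two quoted figures.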

Citation

Pages: 557-570

Page count: 14

50 records in total

[1] Performance analysis on CRAY T3E
Gerndt, M
Mohr, B
Pantano, M
Wolf, F
PROCEEDINGS OF THE SEVENTH EUROMICRO WORKSHOP ON PARALLEL AND DISTRIBUTED PROCESSING, PDP'99, 1999, : 241 - 248
[2] Performance of parallel Gaussian 94 on the Cray T3E
Sosa, CP
Ochterski, J
Carpenter, J
Frisch, MJ
ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 1997, 213 : 31 - COMP
[3] Parallelising the unified model for the Cray T3E
Burton, P
Dickinson, A
MAKING ITS MARK, 1997, : 68 - 82
[4] PScheD - Political Scheduling on the CRAY T3E
Lagerstrom, RN
Gipp, SK
JOB SCHEDULING STRATEGIES FOR PARALLEL PROCESSING, 1997, 1291 : 117 - 138
[5] Parallel pivots LU algorithm on the Cray T3E
Asenjo, R
Zapata, EL
PARALLEL COMPUTATION, 1999, 1557 : 38 - 47
[6] Running a code for lattice quantum chromodynamics efficiently on CRAY T3E systems
Attig, N
Güsken, S
Lacock, P
Lippert, T
Schilling, K
Ueberholz, P
Viehoff, J
HIGH-PERFORMANCE COMPUTING AND NETWORKING, 1998, 1401 : 183 - 192
[7] Giant eigenproblems from lattice gauge theory on CRAY T3E systems
Attig, N
Lippert, T
Neff, H
Negele, J
Schilling, K
COMPUTER PHYSICS COMMUNICATIONS, 2001, 142 (1-3) : 196 - 200
[8] Fine-grained multithreading on the Cray T3E
Grävinghoff, A
Keller, J
HIGH PERFORMANCE COMPUTING IN SCIENCE AND ENGINEERING '99, 2000, : 447 - 456
[9] Impact of PE mapping on Cray T3E message-passing performance
Huedo, E
Prieto, M
Llorente, IM
Tirado, F
EURO-PAR 2000 PARALLEL PROCESSING, PROCEEDINGS, 2000, 1900 : 199 - 207
[10] Performance and scalability analysis of Cray X1 vectorization and multistreaming optimization
Alam, S
Vetter, J
COMPUTATIONAL SCIENCE - ICCS 2005, PT 1, PROCEEDINGS, 2005, 3514 : 304 - 312
