A methodology for detailed performance modeling of reduction computations on SMP machines

Cited by: 3

Authors:

Jin, RM [1]

Agrawal, G [1]

Affiliations:

[1] Ohio State Univ, Dept Comp & Informat Sci, Columbus, OH 43210 USA

Source:

PERFORMANCE EVALUATION | 2005 / Vol. 60 / Issues 1-4

Funding:

US National Science Foundation;

Keywords:

parallel processing; shared memory; memory hierarchy; data mining;

DOI:

10.1016/j.peva.2004.10.017

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

In this paper, we revisit the problem of performance prediction on SMP machines, motivated by the need to select a parallelization strategy for random write reductions. Such reductions frequently arise in data mining algorithms. In our previous work, we developed a number of techniques for parallelizing this class of reductions and showed that each of the three techniques (full replication, optimized full locking, and cache-sensitive) can outperform the others depending on problem, dataset, and machine parameters. Therefore, an important question is, "Can we predict the performance of these techniques for a given problem, dataset, and machine?" This paper addresses this question by developing an analytical performance model that captures a two-level cache, coherence cache misses, TLB misses, locking overheads, and contention for memory. The analytical model is combined with results from micro-benchmarking to predict performance on real machines. We have validated our model on two different SMP machines. Our results show that the model effectively captures the impact of the memory hierarchy (two-level cache and TLB) as well as the factors that limit parallelism (contention for locks, memory contention, and coherence cache misses). The difference between predicted and measured performance is within 20% in almost all cases. Moreover, the model is quite accurate in predicting the relative performance of the three parallelization techniques. (c) 2004 Elsevier B.V. All rights reserved.
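
As a rough illustration of the kind of trade-off the abstract describes, the sketch below contrasts two ways of parallelizing a random write reduction: giving each thread a private copy of the reduction array (in the spirit of full replication) and sharing one array protected by one lock per element (a plain locking scheme). It is not the authors' implementation. The function names, the `(index, value)` update format, and the strided work split are assumptions made for this example, and the optimized full locking and cache-sensitive variants are not shown because the abstract does not specify their memory layout.

```python
import threading


def reduce_full_replication(updates, num_elems, num_threads):
    """Each thread accumulates into its own private copy; copies are merged at the end."""
    copies = [[0] * num_elems for _ in range(num_threads)]

    def worker(tid):
        local = copies[tid]
        # Hypothetical work split: thread tid takes every num_threads-th update.
        for idx, val in updates[tid::num_threads]:
            local[idx] += val  # no lock needed: the copy is private to this thread

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Merge step: sum element-wise across the per-thread copies.
    return [sum(column) for column in zip(*copies)]


def reduce_per_element_locking(updates, num_elems, num_threads):
    """All threads update one shared array, guarded by one lock per element."""
    shared = [0] * num_elems
    locks = [threading.Lock() for _ in range(num_elems)]

    def worker(tid):
        for idx, val in updates[tid::num_threads]:
            with locks[idx]:  # locking overhead is paid on every update
                shared[idx] += val

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared
```

The replicated version trades extra memory and a merge pass for lock-free updates, while the locking version keeps a single copy but pays for a lock acquisition on every write. Which cost dominates depends on the size of the reduction array, the dataset, and the machine, which is the question the paper's model is meant to answer.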


Pages: 73-105

Page count: 33
