Hybrid static/dynamic scheduling for already optimized dense matrix factorization

Cited by: 9

Authors:

Donfack, Simplice [1]

Grigori, Laura [1]

Gropp, William D. [2]

Kale, Vivek [2]

Affiliations:

[1] Univ Paris 11, INRIA Saclay Ile France, Bat 425, F-91405 Orsay, France

[2] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA

Source:

2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS) | 2012

Keywords:

dynamic scheduling; communication-avoiding; LU factorization; numerical linear algebra; LOCALITY;

DOI:

10.1109/IPDPS.2012.53

Chinese Library Classification:

TP301 [Theory, Methods];

Discipline classification code:

081202 ;

Abstract:

We present the use of a hybrid static/dynamic scheduling strategy of the task dependency graph for direct methods used in dense numerical linear algebra. This strategy provides a balance of data locality, load balance, and low dequeue overhead. We show that the usage of this scheduling in communication-avoiding dense factorization leads to significant performance gains. On a 48-core AMD Opteron NUMA machine, our experiments show that we can achieve up to 64% improvement over a version of CALU that uses fully dynamic scheduling, and up to 30% improvement over the version of CALU that uses fully static scheduling. On a 16-core Intel Xeon machine, our hybrid static/dynamic scheduling approach is up to 8% faster than the version of CALU that uses fully static or fully dynamic scheduling. Our algorithm leads to speedups over the corresponding routines for computing LU factorization in well-known libraries. On the 48-core AMD NUMA machine, our best implementation is up to 110% faster than MKL, while on the 16-core Intel Xeon machine, it is up to 82% faster than MKL. Our approach also shows significant speedups compared with PLASMA on both of these systems.


Pages: 496 - 507

Page count: 12

50 items in total

[1] DYNAMIC LEVELWISE SCHEDULING FOR SPARSE-MATRIX FACTORIZATION ON VECTOR
MONTAGNA, M
GRANELLI, GP
VUONG, GT
CHAHINE, R
ELECTRIC POWER SYSTEMS RESEARCH, 1995, 33 (03) : 185 - 192
[2] Hybrid task scheduling: Integrating static and dynamic heuristics
Boeres, C
Lima, A
Rebello, VEF
15TH SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING, PROCEEDINGS, 2003 : 199 - 206
[3] QR FACTORIZATION OF A DENSE MATRIX ON A HYPERCUBE MULTIPROCESSOR
CHU, E
GEORGE, A
SIAM JOURNAL ON SCIENTIFIC AND STATISTICAL COMPUTING, 1990, 11 (05) : 990 - 1028
[4] Partial Factorization of a Dense Symmetric Indefinite Matrix
Reid, John K.
Scott, Jennifer A.
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2011, 38 (02):
[5] Soft orthogonal non-negative matrix factorization with sparse representation: Static and dynamic
Chen, Yong
Zhang, Hui
Liu, Rui
Ye, Zhiwen
NEUROCOMPUTING, 2018, 310 : 148 - 164
[6] SCHEDULING OF HARD APERIODIC TASKS IN HYBRID STATIC/DYNAMIC PRIORITY SYSTEMS
LEE, J
LEE, S
KIM, H
SIGPLAN NOTICES, 1995, 30 (11) : 7 - 19
[7] Static versus Dynamic Task Scheduling of the LU Factorization on ARM big.LITTLE Architectures
Catalan, Sandra
Rodriguez-Sanchez, Rafael
Quintana-Orti, Enrique S.
Herrero, Jose R.
2017 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2017 : 733 - 742
[8] Communication reduction in multiple multicasts based on hybrid static-dynamic scheduling
Surma, DR
Sha, EHM
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2000, 11 (09) : 865 - 878
[9] Floating Point Architecture Extensions for Optimized Matrix Factorization
Pedram, Ardavan
Gerstlauer, Andreas
van de Geijn, Robert A.
2013 21ST IEEE SYMPOSIUM ON COMPUTER ARITHMETIC (ARITH), 2013 : 49 - 58
[10] Dynamic Exponential Family Matrix Factorization
Hayashi, Kohei
Hirayama, Jun-ichiro
Ishii, Shin
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2009, 5476 : 452 - +
