Hybrid static/dynamic scheduling for already optimized dense matrix factorization

Cited by: 9
Authors
Donfack, Simplice [1]
Grigori, Laura [1]
Gropp, William D. [2]
Kale, Vivek [2]
Affiliations
[1] Univ Paris 11, INRIA Saclay Ile France, Bat 425, F-91405 Orsay, France
[2] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
Keywords
dynamic scheduling; communication-avoiding; LU factorization; numerical linear algebra; locality
DOI
10.1109/IPDPS.2012.53
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We present the use of a hybrid static/dynamic scheduling strategy of the task dependency graph for direct methods used in dense numerical linear algebra. This strategy provides a balance of data locality, load balance, and low dequeue overhead. We show that using this scheduling in communication-avoiding dense factorization leads to significant performance gains. On a 48-core AMD Opteron NUMA machine, our experiments show that we can achieve up to 64% improvement over a version of CALU that uses fully dynamic scheduling, and up to 30% improvement over the version of CALU that uses fully static scheduling. On a 16-core Intel Xeon machine, our hybrid static/dynamic scheduling approach is up to 8% faster than the versions of CALU that use fully static or fully dynamic scheduling. Our algorithm leads to speedups over the corresponding routines for computing LU factorization in well-known libraries. On the 48-core AMD NUMA machine, our best implementation is up to 110% faster than MKL, while on the 16-core Intel Xeon machine, it is up to 82% faster than MKL. Our approach also shows significant speedups compared with PLASMA on both of these systems.
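The general idea of mixing static and dynamic scheduling can be illustrated with a minimal Python sketch. This is not the authors' CALU implementation: the abstract does not give the algorithm, and the sketch ignores the task dependency graph entirely. The function name `hybrid_schedule`, the `static_fraction` parameter, and the round-robin static assignment are all assumptions made for illustration. A fixed share of the tasks is pre-assigned to workers (favoring locality and avoiding dequeue overhead), and the remainder is pulled from a shared queue (favoring load balance).

```python
import queue
import threading


def hybrid_schedule(tasks, num_workers, static_fraction):
    """Run independent callables with a static part and a dynamic part.

    Hypothetical helper for illustration only: tasks are assumed to have
    no dependencies on each other. Returns, per worker, the list of
    results in the order that worker executed them.
    """
    n_static = int(len(tasks) * static_fraction)
    static_part, dynamic_part = tasks[:n_static], tasks[n_static:]

    # Static section: round-robin pre-assignment, fixed before execution.
    static_lists = [static_part[w::num_workers] for w in range(num_workers)]

    # Dynamic section: a shared queue that idle workers pull from.
    shared = queue.Queue()
    for task in dynamic_part:
        shared.put(task)

    executed = [[] for _ in range(num_workers)]

    def worker(w):
        # Each worker first runs its own pre-assigned tasks...
        for task in static_lists[w]:
            executed[w].append(task())
        # ...then takes leftover tasks until the shared queue is empty.
        while True:
            try:
                task = shared.get_nowait()
            except queue.Empty:
                break
            executed[w].append(task())

    threads = [threading.Thread(target=worker, args=(w,))
               for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return executed


# Ten trivial tasks, three workers, half of the tasks statically assigned.
tasks = [lambda i=i: i for i in range(10)]
result = hybrid_schedule(tasks, num_workers=3, static_fraction=0.5)
```

Setting `static_fraction` to 1.0 or 0.0 gives the fully static and fully dynamic extremes that the abstract compares against. Which worker runs each dynamic task depends on thread timing, so only the static prefix of each worker's list is deterministic.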
Pages: 496-507
Page count: 12