Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime

Cited by: 2
Authors
Yamazaki, Ichitaro [1]
Kurzak, Jakub [1]
Luszczek, Piotr [1]
Dongarra, Jack [1, 2, 3]
Affiliations
[1] Univ Tennessee, Knoxville, TN 37996 USA
[2] Oak Ridge Natl Lab, Oak Ridge, TN 37831 USA
[3] Univ Manchester, Manchester M13 9PL, Lancs, England
Funding
National Science Foundation (USA)
Keywords
systolic array; QR decomposition; multithreading; message-passing; dataflow; runtime
DOI
10.1109/IPDPSW.2014.167
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
A systolic array provides an alternative computing paradigm to the von Neumann architecture. Though its hardware implementation has failed as a paradigm to design integrated circuits in the past, we are now discovering that the systolic array as a software virtualization layer can lead to an extremely scalable execution paradigm. To demonstrate this scalability, in this paper, we design and implement a 3D virtual systolic array to compute a tile QR decomposition of a tall-and-skinny dense matrix. Our implementation is based on a state-of-the-art algorithm that factorizes a panel based on a tree-reduction. Using a runtime developed as a part of the Parallel Ultra Light Systolic Array Runtime (PULSAR) project, we demonstrate on a Cray-XT5 machine how our virtual systolic array can be mapped to a large-scale machine and obtain excellent parallel performance. This is an important contribution since such a QR decomposition is used, for example, to compute a least squares solution of an overdetermined system, which arises in many scientific and engineering problems.
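The tree-reduction idea mentioned in the abstract can be illustrated with a small serial sketch. This is not the authors' implementation and does not use PULSAR, tiles, or a systolic array. It assumes a dense matrix stored as a list of rows, uses Givens rotations in place of whatever kernels the paper uses, and computes only the R factor. The names `r_factor`, `tsqr_r`, `gram`, and the `block_rows` parameter are hypothetical and chosen for illustration.

```python
import math


def r_factor(rows):
    """Return the upper-triangular (or upper-trapezoidal) R factor of a
    matrix given as a list of rows, using Givens rotations."""
    a = [list(r) for r in rows]
    m, n = len(a), len(a[0])
    for j in range(n):
        # Zero the entries below the diagonal in column j, bottom up.
        for i in range(m - 1, j, -1):
            x, y = a[i - 1][j], a[i][j]
            h = math.hypot(x, y)
            if h == 0.0:
                continue
            c, s = x / h, y / h
            a[i - 1][j], a[i][j] = h, 0.0
            for k in range(j + 1, n):
                u, v = a[i - 1][k], a[i][k]
                a[i - 1][k] = c * u + s * v
                a[i][k] = -s * u + c * v
    return a[: min(m, n)]


def tsqr_r(rows, block_rows):
    """Compute R for a tall-and-skinny matrix by a binary tree reduction:
    factor each row block, then repeatedly factor stacked pairs of R's."""
    blocks = [rows[i:i + block_rows] for i in range(0, len(rows), block_rows)]
    rs = [r_factor(b) for b in blocks]
    while len(rs) > 1:
        merged = []
        for i in range(0, len(rs) - 1, 2):
            # Stack two R factors and reduce them to a single R factor.
            merged.append(r_factor(rs[i] + rs[i + 1]))
        if len(rs) % 2:
            merged.append(rs[-1])  # odd one out moves up a level unchanged
        rs = merged
    return rs[0]


def gram(rows):
    """Return A^T A for a matrix given as a list of rows."""
    n = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
```

Because every step applies orthogonal transformations, the Gram matrix of the final R matches that of the input up to rounding, which gives a simple check: `gram(tsqr_r(A, block_rows))` should agree with `gram(A)`. In a parallel setting the leaf factorizations and the merges at each tree level are independent of one another, which is the property a tree-based panel factorization relies on.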
Pages: 1495-1504
Page count: 10