A Runtime System for Programming Out-of-Core Matrix Algorithms-by-Tiles on Multithreaded Architectures

Cited by: 10
Authors
Quintana-Orti, Gregorio [1 ]
Igual, Francisco D. [1 ]
Marques, Mercedes [1 ]
Quintana-Orti, Enrique S. [1 ]
van de Geijn, Robert A. [2 ]
Affiliations
[1] Univ Jaume 1, Dept Ingn & Ciencia Comp, Castellon de La Plana 12071, Spain
[2] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
Source
Keywords
Algorithms; Performance; High-performance; libraries; linear algebra; multithreaded architectures; out-of-core algorithms; HIGH-PERFORMANCE; COMPUTATION;
DOI
10.1145/2331130.2331133
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Subject classification code
081202 ; 0835 ;
Abstract
Out-of-core implementations of algorithms for dense matrix computations have traditionally focused on optimal use of memory so as to minimize I/O, often trading programmability for performance. In this article we show how the current state of hardware and software allows the programmability problem to be addressed without sacrificing performance. This comes from the realizations that memory is cheap and large, making it less necessary to optimally orchestrate I/O, and that new algorithms view matrices as collections of submatrices and computation as operations with those submatrices. This enables libraries to be coded at a high level of abstraction, leaving the tasks of scheduling the computations and data movement in the hands of a runtime system. This is in sharp contrast to more traditional approaches that leverage optimal use of in-core memory and, at the expense of introducing considerable programming complexity, explicit overlap of I/O with computation. Performance is demonstrated for this approach on multicore architectures as well as platforms equipped with hardware accelerators.
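The division of labor described in the abstract can be illustrated with a minimal sketch. It is not the authors' runtime: `TileStore`, `TileCache`, and `out_of_core_matmul` are hypothetical names, a dictionary stands in for the disk, and matrix multiplication is used only because it is the simplest tiled operation. The algorithm is written purely as operations on tiles, while a small LRU cache with write-back decides when tiles move between "disk" and memory.

```python
from collections import OrderedDict


class TileStore:
    """Hypothetical stand-in for disk: maps (name, i, j) to a tile."""

    def __init__(self):
        self.disk = {}
        self.reads = 0
        self.writes = 0

    def read(self, key):
        self.reads += 1
        return [row[:] for row in self.disk[key]]

    def write(self, key, tile):
        self.writes += 1
        self.disk[key] = [row[:] for row in tile]


class TileCache:
    """LRU cache of in-core tiles; dirty tiles are written back on eviction."""

    def __init__(self, store, capacity):
        assert capacity >= 3, "one tile each of A, B and C must fit in core"
        self.store, self.capacity = store, capacity
        self.tiles = OrderedDict()  # key -> (tile, dirty flag)

    def get(self, key, will_write=False):
        if key in self.tiles:
            tile, dirty = self.tiles.pop(key)  # hit: re-inserted below as newest
        else:
            if len(self.tiles) >= self.capacity:
                self._evict()
            tile, dirty = self.store.read(key), False
        self.tiles[key] = (tile, dirty or will_write)
        return tile

    def _evict(self):
        key, (tile, dirty) = self.tiles.popitem(last=False)  # least recently used
        if dirty:
            self.store.write(key, tile)

    def flush(self):
        while self.tiles:
            self._evict()


def gemm_tile(c, a, b):
    """In-core kernel on tiles: c += a * b."""
    for i in range(len(a)):
        for k in range(len(b)):
            aik = a[i][k]
            for j in range(len(b[0])):
                c[i][j] += aik * b[k][j]


def scatter(store, name, m, ts):
    """Cut square matrix m into ts-by-ts tiles and place them on 'disk'."""
    nt = len(m) // ts
    for i in range(nt):
        for j in range(nt):
            rows = m[i * ts:(i + 1) * ts]
            store.disk[(name, i, j)] = [r[j * ts:(j + 1) * ts] for r in rows]


def gather(store, name, nt, ts):
    """Reassemble a full matrix from its tiles on 'disk'."""
    n = nt * ts
    return [[store.disk[(name, r // ts, c // ts)][r % ts][c % ts]
             for c in range(n)] for r in range(n)]


def out_of_core_matmul(a, b, tile_size, capacity):
    """Assumes square matrices whose order is a multiple of tile_size."""
    n = len(a)
    nt = n // tile_size
    store = TileStore()
    scatter(store, "A", a, tile_size)
    scatter(store, "B", b, tile_size)
    scatter(store, "C", [[0] * n for _ in range(n)], tile_size)
    cache = TileCache(store, capacity)
    # The algorithm-by-tiles: no explicit I/O appears in this loop nest.
    for i in range(nt):
        for j in range(nt):
            for k in range(nt):
                ta = cache.get(("A", i, k))
                tb = cache.get(("B", k, j))
                tc = cache.get(("C", i, j), will_write=True)
                gemm_tile(tc, ta, tb)
    cache.flush()
    return gather(store, "C", nt, tile_size), store
```

In this sketch, changing `capacity` changes how many tile reads occur but not the loop nest, which is the separation between algorithm and data movement that the abstract argues for. A real runtime would additionally overlap I/O with computation and schedule independent tile operations on multiple threads, which this sequential sketch does not attempt.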
Pages: 25
Related papers
50 in total
  • [41] GraphSD: A State and Dependency aware Out-of-Core Graph Processing System
    Xu, Xianghao
    Jiang, Hong
    Wang, Fang
    Cheng, Yongli
    Fang, Peng
    51ST INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2022, 2022
  • [42] EXOCHI: Architecture and programming environment for a heterogeneous multi-core multithreaded system
    Wang, Perry H.
    Collins, Jamison D.
    Chinya, Gautham N.
    Jiang, Hong
    Tian, Xinmin
    Girkar, Milind
    Yang, Nick Y.
    Lueh, Guei-Yuan
    Wang, Hong
    ACM SIGPLAN NOTICES, 2007, 42 (06) : 156 - 166
  • [43] EXOCHI: Architecture and Programming Environment for A Heterogeneous Multi-core Multithreaded System
    Wang, Perry H.
    Collins, Jamison D.
    Chinya, Gautham N.
    Jiang, Hong
    Tian, Xinmin
    Girkar, Milind
    Yang, Nick Y.
    Lueh, Guei-Yuan
    Wang, Hong
    PLDI'07: PROCEEDINGS OF THE 2007 ACM SIGPLAN CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION, 2007, : 156 - 166
  • [44] OUT-OF-CORE FUEL-MANAGEMENT OPTIMIZATION UTILIZING THE INTEGER LINEAR-PROGRAMMING TECHNIQUE
    HAQ, S
    TURINSKY, PJ
    TRANSACTIONS OF THE AMERICAN NUCLEAR SOCIETY, 1985, 50 (NOV) : 95 - 96
  • [45] Testing the out-of-core monitoring system of the WWER-440 core power distribution nonuniformity
    Kamyshan, A.N.
    Kostitsin, A.R.
    Luzhnov, A.M.
    Morozov, V.V.
    Zhernov, V.S.
    Sokolov, I.V.
    Atomnaya Energiya, 1998, 84 (03) : 203 - 210
  • [46] Seeing through the window: Pre-fetching strategies for out-of-core image processing algorithms
    Pinho, R.
    Batenburg, K. J.
    Sijbers, J.
    MEDICAL IMAGING 2008: PACS AND IMAGING INFORMATICS, 2008, 6919
  • [47] Hera-JVM: A Runtime System for Heterogeneous Multi-Core Architectures
    McIlroy, Ross
    Sventek, Joe
    ACM SIGPLAN NOTICES, 2010, 45 (10) : 205 - 222
  • [48] Wonderland: A Novel Abstraction-Based Out-Of-Core Graph Processing System
    Zhang, Mingxing
    Wu, Yongwei
    Zhuo, Youwei
    Qian, Xuehai
    Huan, Chengying
    Chen, Kang
    ACM SIGPLAN NOTICES, 2018, 53 (02) : 608 - 621
  • [49] Evaluation of Flash-based Out-of-core Stencil Computation Algorithms for SSD-Equipped Clusters
    Midorikawa, Hiroko
    Tan, Hideyuki
    2016 IEEE 22ND INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2016, : 1031 - 1040
  • [50] CLIP: A Disk I/O Focused Parallel Out-of-Core Graph Processing System
    Ai, Zhiyuan
    Zhang, Mingxing
    Wu, Yongwei
    Qian, Xuehai
    Chen, Kang
    Zheng, Weimin
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2019, 30 (01) : 45 - 62