IRIS-BLAS: Towards a Performance Portable and Heterogeneous BLAS Library

Cited by: 5
Authors
Miniskar, Narasinga Rao [1]
Monil, Mohammad Alaul Haque [1]
Valero-Lara, Pedro [1]
Liu, Frank [1]
Vetter, Jeffrey S. [1]
Affiliation
[1] Oak Ridge National Laboratory, Computer Science and Mathematics Division, Oak Ridge, TN 37830, USA
Source
2022 IEEE 29th International Conference on High Performance Computing, Data, and Analytics (HiPC), 2022
Keywords
Performance Portable; Heterogeneity; IRIS; BLAS; Tasking
DOI
10.1109/HiPC56025.2022.00042
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents IRIS-BLAS, a novel heterogeneous and performance-portable BLAS library. IRIS-BLAS is built on top of the IRIS runtime and multiple vendor and open-source BLAS libraries. It can transparently use all the architectures and devices available in a heterogeneous system, selecting the appropriate BLAS library based on the task mapping at run time. Thus, IRIS-BLAS is portable across a broad spectrum of architectures and BLAS libraries, sparing application developers from modifying application source code. Although the emphasis is on portability, IRIS-BLAS provides performance that is competitive with, or even better than, other state-of-the-art references. Moreover, IRIS-BLAS offers new capabilities, such as efficiently using extremely heterogeneous systems composed of multiple GPUs from different hardware vendors.
Pages: 256-261
Page count: 6
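
Illustrative sketch
The abstract describes choosing a BLAS library according to the device a task is mapped to at run time. The following is a minimal Python sketch of that general idea only; it is not the IRIS-BLAS or IRIS API. The names `BACKENDS`, `dispatch_gemm`, the device kinds, and the library labels are all hypothetical, and a plain triple-loop multiply stands in for every vendor GEMM so the example runs without any GPU or BLAS installation.

```python
from typing import Callable, Dict, List, Tuple

Matrix = List[List[float]]


def _reference_gemm(a: Matrix, b: Matrix) -> Matrix:
    """Plain triple-loop matrix multiply; a stand-in for a real vendor GEMM."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical table: device kind -> (library label, GEMM implementation).
# In a real system each entry would wrap a different vendor or open-source
# BLAS library; here they all share the same reference routine.
BACKENDS: Dict[str, Tuple[str, Callable[[Matrix, Matrix], Matrix]]] = {
    "cpu": ("cpu-blas", _reference_gemm),
    "nvidia_gpu": ("nvidia-gpu-blas", _reference_gemm),
    "amd_gpu": ("amd-gpu-blas", _reference_gemm),
}


def dispatch_gemm(device_kind: str, a: Matrix, b: Matrix) -> Tuple[str, Matrix]:
    """Pick the backend for the device a task was mapped to, then run GEMM."""
    if device_kind not in BACKENDS:
        raise ValueError(f"no BLAS backend registered for {device_kind!r}")
    label, gemm = BACKENDS[device_kind]
    return label, gemm(a, b)


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    # The caller's code is identical regardless of which device is selected.
    for kind in ("cpu", "nvidia_gpu", "amd_gpu"):
        print(dispatch_gemm(kind, a, b))
```

The point of the sketch is that the calling code does not change when the target device changes; only the table lookup does, which is the portability property the abstract claims.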