Elegante: A Machine Learning-Based Threads Configuration Tool for SpMV Computations on Shared Memory Architecture

Authors
Ahmad, Muhammad [1]
Sardar, Usman [2]
Batyrshin, Ildar [1]
Hasnain, Muhammad [3]
Sajid, Khan [4]
Sidorov, Grigori [1]
Affiliations
[1] Inst Politecn Nacl CIC PN, Ctr Invest Comp, Mexico City 07738, Mexico
[2] Inst Arts & Culture, Sch Informat & Robot, Lahore 54000, Pakistan
[3] Lahore Leads Univ, Dept Comp Sci, Lahore 54000, Pakistan
[4] Zhejiang Normal Univ, Coll Comp Sci & Technol, Jinhua 321004, Peoples R China
Keywords
CSR; machine learning; SVM; high-performance computing; parallel computing; OpenMPI; shared memory;
DOI
10.3390/info15110685
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The sparse matrix-vector product (SpMV) is a fundamental computational kernel used in a wide range of scientific and engineering applications, commonly in solving systems of linear equations and partial differential equations. Computing the SpMV product in parallel is challenging. Existing solutions often assign a fixed number of threads to rows based on empirical formulas, which leads to sub-optimal configurations and significant performance losses. Elegante, our proposed machine learning-based tool, takes a data-driven approach to identifying the optimal thread configuration for SpMV computations on a shared memory architecture: it predicts the best thread configuration from the sparsity pattern of each sparse matrix. Our approach involves training and testing various base and ensemble machine learning algorithms, namely decision tree, random forest, gradient boosting, logistic regression, and support vector machine. We experimented with a dataset of roughly 1000 real-world matrices from 46 distinct application domains, including robotics, power networks, 2D/3D meshing, and computational fluid dynamics. Our proposed methodology achieved 62% of the highest achievable performance and was 7.33 times faster, a significant improvement over the default OpenMP configuration policy and the traditional practice of manually or randomly selecting the number of threads. This work is the first attempt to use the structure of the matrix to predict the optimal thread configuration for parallel SpMV computation in a shared memory environment.
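To make the abstract's idea concrete, the following is a minimal sketch, not the authors' implementation, of an SpMV kernel over the CSR format named in the keywords, together with a few structural features that a classifier could take as input when predicting a thread count. The function names and the feature set (csr_spmv, sparsity_features, density, per-row nonzero statistics) are assumptions chosen for illustration; the record does not state which features Elegante actually uses.

```python
from statistics import mean, pstdev


def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR format."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def sparsity_features(row_ptr, n_cols):
    """Hypothetical structural features (an assumption, not the paper's set)."""
    row_nnz = [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]
    n_rows = len(row_nnz)
    nnz = row_ptr[-1]
    return {
        "n_rows": n_rows,
        "n_cols": n_cols,
        "nnz": nnz,
        "density": nnz / (n_rows * n_cols),
        "mean_row_nnz": mean(row_nnz),
        "std_row_nnz": pstdev(row_nnz),  # spread of row lengths (load imbalance)
        "max_row_nnz": max(row_nnz),
    }


# 3x3 example matrix: [[4, 0, 1], [0, 2, 0], [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]

y = csr_spmv(row_ptr, col_idx, values, [1.0, 1.0, 1.0])
features = sparsity_features(row_ptr, 3)
```

In a pipeline of the kind the abstract describes, a feature vector like this would be computed per matrix and passed to a trained classifier whose label is the best-performing thread count. The rows of the outer loop are independent, which is what makes the kernel a candidate for row-wise thread assignment.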
Pages: 19
Related papers
50 in total
  • [1] AAQAL: A Machine Learning-Based Tool for Performance Optimization of Parallel SPMV Computations Using Block CSR
    Ahmed, Muhammad
    Usman, Sardar
    Shah, Nehad Ali
    Ashraf, M. Usman
    Alghamdi, Ahmed Mohammed
    Bahadded, Adel A.
    Almarhabi, Khalid Ali
    APPLIED SCIENCES-BASEL, 2022, 12 (14)
  • [2] ZAKI+: A Machine Learning Based Process Mapping Tool for SpMV Computations on Distributed Memory Architectures
    Usman, Sardar
    Mehmood, Rashid
    Katib, Iyad
    Albeshri, Aiiad
    IEEE ACCESS, 2019, 7 : 81279 - 81296
  • [3] DIESEL: A novel deep learning-based tool for SpMV computations and solving sparse linear equation systems
    Mohammed, Thaha
    Albeshri, Aiiad
    Katib, Iyad
    Mehmood, Rashid
    JOURNAL OF SUPERCOMPUTING, 2021, 77 (06) : 6313 - 6355
  • [4] DIESEL: A novel deep learning-based tool for SpMV computations and solving sparse linear equation systems
    Thaha Mohammed
    Aiiad Albeshri
    Iyad Katib
    Rashid Mehmood
    The Journal of Supercomputing, 2021, 77 : 6313 - 6355
  • [5] Machine Learning-Based Kernel Selector for SpMV Optimization in Graph Analysis
    Xiao, Guoqing
    Zhou, Tao
    Chen, Yuedan
    Hu, Yikun
    Li, Kenli
    ACM TRANSACTIONS ON PARALLEL COMPUTING, 2024, 11 (02)
  • [6] Revisiting thread configuration of SpMV kernels on GPU: A machine learning based approach
    Gao, Jianhua
    Ji, Weixing
    Liu, Jie
    Wang, Yizhuo
    Shi, Feng
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2024, 185
  • [7] A Machine Learning-Based Approach for Selecting SpMV Kernels and Matrix Storage Formats
    Cui, Hang
    Hirasawa, Shoichi
    Kobayashi, Hiroaki
    Takizawa, Hiroyuki
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2018, E101D (09) : 2307 - 2314
  • [8] Evaluating a Machine Learning-based Approach for Cache Configuration
    Ribeiro, Lucas
    Jacobi, Ricardo
    Junior, Francisco
    da Silva, Jones Yudi
    Silva, Ivan Saraiva
    2022 IEEE 13TH LATIN AMERICAN SYMPOSIUM ON CIRCUITS AND SYSTEMS (LASCAS), 2022 : 180 - 183
  • [9] Reinforcement learning-based architecture search for quantum machine learning
    Rapp, Frederic
    Kreplin, David A.
    Huber, Marco F.
    Roth, Marco
    MACHINE LEARNING-SCIENCE AND TECHNOLOGY, 2025, 6 (01)
  • [10] Machine Learning-Based Configuration Parameter Tuning on Hadoop System
    Chen, Chi-Ou
    Zhuo, Ye-Qi
    Yeh, Chao-Chun
    Lin, Che-Min
    Liao, Shih-wei
    2015 IEEE INTERNATIONAL CONGRESS ON BIG DATA - BIGDATA CONGRESS 2015, 2015 : 386 - 392