CAMLB-SpMV: An Efficient Cache-Aware Memory Load-Balancing SpMV on CPU

Authors
Guo, Jihu [1]
Xia, Rui [1]
Zhu, Xiaoxiong [1]
Zhang, Xiang [1]
Liu, Jie [1]
Affiliations
[1] Natl Univ Def Technol, Changsha, Hunan, Peoples R China
Keywords
CPU; SpMV; Load balance; Cache line; Matrix-vector multiplication
DOI
10.1145/3673038.3673042
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Sparse Matrix-Vector Multiplication (SpMV) plays a crucial role in scientific computing, but severe load imbalance among threads restricts its performance. Previous load-balancing methods have largely ignored the CPU's cache-line-based memory access characteristics and the impact of data locality during workload evaluation and partitioning, which limits their effectiveness. To address this issue, we propose a cache-aware memory load-balancing SpMV algorithm, CAMLB-SpMV, based on the Compressed Sparse Row (CSR) format. We evaluate all memory access loads of CSR-based SpMV at the cache-line level and use a sliding window to record accesses to the vector x, so that the load evaluation accounts for data locality. Finally, the total workload is distributed evenly among threads to achieve load balance. Experimental results on 2661 sparse matrices from the SuiteSparse sparse matrix dataset show that CAMLB-SpMV outperforms Intel MKL, Merge-Based, CSR5-AVX512, and CVR by average factors of 1.16x, 1.19x, 2.16x, and 1.17x (up to 13.77x, 7.45x, 15.89x, and 8.63x), respectively, on an Intel Xeon Platinum 9242. Moreover, it outperforms AMD AOCL, Merge-Based, and CSR5-AVX2 by average factors of 2.70x, 1.62x, and 3.40x (up to 25.03x, 4.36x, and 13.7x), respectively, on an AMD EPYC 7542.
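The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cost model, the function names `row_costs` and `partition_rows`, and the parameters `line_elems` (elements per cache line) and `window` (number of recently touched cache lines of x that are remembered) are all assumptions made for illustration. Each row is charged for the cache lines of CSR data it spans, plus one cache line for every access to x that misses the sliding window. Rows are then split among threads at roughly equal cumulative cost.

```python
from bisect import bisect_left
from collections import deque


def row_costs(row_ptr, col_idx, line_elems=8, window=4):
    """Estimate a per-row memory cost in cache lines (hypothetical cost model)."""
    recent = deque(maxlen=window)  # sliding window of recently touched x cache lines
    costs = []
    for r in range(len(row_ptr) - 1):
        start, end = row_ptr[r], row_ptr[r + 1]
        cost = 0
        if end > start:
            # Cache lines spanned by this row's slice of the contiguous CSR arrays
            # (counted once here as a simplification).
            cost += (end - 1) // line_elems - start // line_elems + 1
        for j in col_idx[start:end]:
            line = j // line_elems  # cache line of x holding element j
            if line not in recent:
                # Not seen recently: assume it must be fetched from memory.
                cost += 1
                recent.append(line)
        costs.append(cost)
    return costs


def partition_rows(costs, num_threads):
    """Return row boundaries so each thread gets a near-equal share of total cost."""
    prefix = [0]
    for c in costs:
        prefix.append(prefix[-1] + c)
    total = prefix[-1]
    bounds = [0]
    for t in range(1, num_threads):
        target = total * t / num_threads
        # First row index whose cumulative cost reaches the thread's target.
        bounds.append(bisect_left(prefix, target))
    bounds.append(len(costs))
    return bounds
```

Thread t would then process rows `bounds[t]` up to, but not including, `bounds[t + 1]`. Splitting only at row boundaries is a simplification chosen to keep the sketch short.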
Pages: 640-649
Page count: 10