CAMLB-SpMV: An Efficient Cache-Aware Memory Load-Balancing SpMV on CPU

Cited by: 0

Authors:

Guo, Jihu [1]

Xia, Rui [1]

Zhu, Xiaoxiong [1]

Zhang, Xiang [1]

Liu, Jie [1]

Affiliations:

[1] National University of Defense Technology, Changsha, Hunan, People's Republic of China

Source:

53rd International Conference on Parallel Processing, ICPP 2024 | 2024

Keywords:

CPU; SpMV; Load balance; Cache line; Matrix-vector multiplication

DOI:

10.1145/3673038.3673042

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Sparse Matrix-Vector Multiplication (SpMV) plays a crucial role in scientific computing, but severe load imbalance among threads restricts its performance. Previous load-balancing methods have largely ignored the CPU's cache line-based memory access characteristics and the impact of data locality during workload evaluation and partitioning, which limits their effectiveness. To address this issue, we propose a cache-aware memory load-balancing SpMV algorithm, CAMLB-SpMV, based on the Compressed Sparse Row (CSR) format. We evaluate all memory access loads of CSR-based SpMV at the cache line level and use a sliding window to record accesses to the input vector x, enabling the load evaluation to perceive data locality. Finally, the total workload is evenly distributed among threads to achieve load balance. Experimental results on 2661 sparse matrices from the SuiteSparse sparse matrix dataset demonstrate that CAMLB-SpMV surpasses Intel MKL, Merge-Based, CSR5-AVX512, and CVR by an average factor of 1.16x, 1.19x, 2.16x, and 1.17x (up to 13.77x, 7.45x, 15.89x, and 8.63x), respectively, on Intel Xeon Platinum 9242. Moreover, it outperforms AMD AOCL, Merge-Based, and CSR5-AVX2 by an average factor of 2.70x, 1.62x, and 3.40x (up to 25.03x, 4.36x, and 13.7x), respectively, on AMD EPYC 7542.
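
The load evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cost model, the function names row_costs and partition_rows, the 64-byte cache line, the 8-byte values, the 4-byte column indices, the window size, and the row-granularity split are all assumptions made for illustration. The sketch counts the cache lines each CSR row touches, uses a small sliding window of recently accessed cache lines of x so that repeated accesses to nearby entries are not charged again, and then cuts the rows into contiguous chunks of roughly equal estimated cost.

```python
from bisect import bisect_left
from collections import deque
from itertools import accumulate

LINE_BYTES = 64  # assumed cache line size
VAL_BYTES = 8    # assumed double-precision values
IDX_BYTES = 4    # assumed 32-bit column indices


def row_costs(row_ptr, col_idx, window=4):
    """Estimate the cache lines touched by each CSR row (hypothetical model)."""
    recent = deque(maxlen=window)  # sliding window of recently used x lines
    costs = []
    for r in range(len(row_ptr) - 1):
        start, end = row_ptr[r], row_ptr[r + 1]
        lines = 0
        if end > start:
            # Streaming arrays: cache lines spanned by this row's values and
            # column indices. Lines shared with a neighbouring row are counted
            # for both rows, which is a simplification.
            lines += (end * VAL_BYTES - 1) // LINE_BYTES - (start * VAL_BYTES) // LINE_BYTES + 1
            lines += (end * IDX_BYTES - 1) // LINE_BYTES - (start * IDX_BYTES) // LINE_BYTES + 1
        for k in range(start, end):
            x_line = col_idx[k] * VAL_BYTES // LINE_BYTES
            if x_line not in recent:
                # Charge an x access only when its line is not in the window.
                lines += 1
                recent.append(x_line)
        costs.append(lines)
    return costs


def partition_rows(costs, n_threads):
    """Split rows into contiguous chunks of roughly equal estimated cost."""
    prefix = list(accumulate(costs))
    total = prefix[-1] if prefix else 0
    bounds = [0]
    for t in range(1, n_threads):
        target = total * t / n_threads
        # Number of leading rows whose cumulative cost stays below the target.
        bounds.append(bisect_left(prefix, target))
    bounds.append(len(costs))
    return bounds


if __name__ == "__main__":
    row_ptr = [0, 2, 4, 6, 8]
    col_idx = [0, 1, 0, 1, 100, 200, 0, 1]
    costs = row_costs(row_ptr, col_idx)
    print(costs, partition_rows(costs, 2))
```

Thread t would then process rows bounds[t] up to, but not including, bounds[t + 1]. Splitting only at row boundaries keeps the sketch short; a finer-grained split inside long rows would balance better on matrices with a few very dense rows.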


Pages: 640-649

Page count: 10
