Acceleration of Large Integer Multiplication with Intel AVX-512 Instructions

Cited by: 3

Authors:

Edamatsu, Takuya [1]

Takahashi, Daisuke [2]

Affiliations:

[1] Univ Tsukuba, Coll Informat Sci, Tsukuba, Ibaraki, Japan

[2] Univ Tsukuba, Ctr Computat Sci, Tsukuba, Ibaraki, Japan

Source:

IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS) | 2018

Keywords:

AVX-512; SIMD; Knights Landing; large integer multiplication; reduced-radix representation;

DOI:

10.1109/HPCC/SmartCity/DSS.2018.00059

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we propose an implementation of large integer multiplication using Single Instruction Multiple Data (SIMD) instructions. We evaluated the implementation on an Intel Xeon Phi processor. The second generation Intel Xeon Phi processor, Knights Landing, has a set of Advanced Vector Extensions-512 (AVX-512) instructions. Using AVX-512, the processor can handle 512 bits at the same time and has the potential to multiply faster than a processor using Streaming SIMD Extensions (SSE) and AVX. Therefore, we applied AVX-512F (foundation) instructions to the program. In the multiplication of large integers, as the number of digits increases, various processing costs also become larger. One of these costs is carry processing. Therefore, we implemented a multiplication function using a reduced-radix representation and compared the execution time and the number of instructions against the GNU Multiple Precision Arithmetic Library (GMP). Furthermore, we used some optimization techniques for this kernel. We successfully achieved an execution time that was approximately 2.5x faster than GMP on the Knights Landing architecture.
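The abstract names carry processing as a cost that a reduced-radix representation helps to avoid. The following is a minimal scalar sketch of that general idea, not the authors' AVX-512 kernel: each limb holds fewer bits than the machine word, so partial products can be accumulated without carries and normalized in a single pass at the end. The 28-bit limb width and the helper names `to_limbs`, `from_limbs`, and `mul_reduced_radix` are assumptions chosen for illustration.

```python
RADIX_BITS = 28  # assumed limb width, chosen for illustration only
MASK = (1 << RADIX_BITS) - 1


def to_limbs(n):
    """Split a non-negative integer into little-endian limbs of RADIX_BITS bits."""
    limbs = []
    while True:
        limbs.append(n & MASK)
        n >>= RADIX_BITS
        if n == 0:
            return limbs


def from_limbs(limbs):
    """Recombine little-endian limbs into a single integer."""
    return sum(limb << (RADIX_BITS * i) for i, limb in enumerate(limbs))


def mul_reduced_radix(a, b):
    """Schoolbook product of two limb lists with deferred carry handling."""
    acc = [0] * (len(a) + len(b))
    # Accumulate column sums with no carry propagation. Each product of two
    # 28-bit limbs is below 2**56, so a 64-bit accumulator could absorb up to
    # 256 of them before overflowing (Python integers never overflow; the
    # headroom matters only for a fixed-width implementation).
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            acc[i + j] += ai * bj
    # One normalization pass brings every limb back below 2**RADIX_BITS.
    carry = 0
    for k in range(len(acc)):
        total = acc[k] + carry
        acc[k] = total & MASK
        carry = total >> RADIX_BITS
    return acc


x = 2**500 + 12345
y = 3**200
product = from_limbs(mul_reduced_radix(to_limbs(x), to_limbs(y)))
print(product == x * y)
```

In a vectorized version the inner accumulation would be the part mapped onto SIMD lanes, since the column sums are independent until the normalization pass.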


Pages: 211-218

Number of pages: 8
