Acceleration of Large Integer Multiplication with Intel AVX-512 Instructions

Cited by: 3

Authors:

Edamatsu, Takuya [1]

Takahashi, Daisuke [2]

Affiliations:

[1] Univ Tsukuba, Coll Informat Sci, Tsukuba, Ibaraki, Japan

[2] Univ Tsukuba, Ctr Computat Sci, Tsukuba, Ibaraki, Japan

Source:

IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS) | 2018

Keywords:

AVX-512; SIMD; Knights Landing; large integer multiplication; reduced-radix representation;

DOI:

10.1109/HPCC/SmartCity/DSS.2018.00059

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we propose an implementation of large integer multiplication using Single Instruction Multiple Data (SIMD) instructions. We evaluated the implementation on an Intel Xeon Phi processor. The second generation Intel Xeon Phi processor, Knights Landing, has a set of Advanced Vector Extensions-512 (AVX-512) instructions. Using AVX-512, the processor can handle 512 bits at the same time and has the potential to multiply faster than a processor using Streaming SIMD Extensions (SSE) and AVX. Therefore, we applied AVX-512F (foundation) instructions to the program. In the multiplication of large integers, as the number of digits increases, various processing costs also become larger. One of these costs is carry processing. Therefore, we implemented a multiplication function using a reduced-radix representation and compared the execution time and the number of instructions against the GNU Multiple Precision Arithmetic Library (GMP). Furthermore, we used some optimization techniques for this kernel. We successfully achieved an execution time that was approximately 2.5x faster than GMP on the Knights Landing architecture.
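The carry-processing point in the abstract can be illustrated with a small sketch. It is not the authors' AVX-512 kernel: it is plain Python, and the 28-bit limb width (`RADIX_BITS`) and the function names are assumptions chosen for illustration. The idea shown is that storing fewer bits per limb than the lane width leaves headroom, so partial products can be accumulated without carries and normalized in one final pass.

```python
# Illustrative sketch only; not the implementation evaluated in the paper.
RADIX_BITS = 28              # assumed limb width, smaller than a 64-bit lane
MASK = (1 << RADIX_BITS) - 1


def to_limbs(n):
    """Split a non-negative integer into little-endian limbs of RADIX_BITS bits."""
    limbs = []
    while True:
        limbs.append(n & MASK)
        n >>= RADIX_BITS
        if n == 0:
            return limbs


def from_limbs(limbs):
    """Recombine little-endian limbs into a single integer."""
    return sum(limb << (RADIX_BITS * i) for i, limb in enumerate(limbs))


def mul_reduced_radix(a, b):
    """Schoolbook multiplication with carry handling deferred to the end."""
    x, y = to_limbs(a), to_limbs(b)
    acc = [0] * (len(x) + len(y))

    # Phase 1: accumulate column sums with no carry propagation.
    # Each product is below 2**56, so a 64-bit lane could absorb up to
    # 2**8 of them before overflowing (Python integers never overflow,
    # so that limit is not enforced in this sketch).
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            acc[i + j] += xi * yj

    # Phase 2: a single normalization pass restores limbs below 2**RADIX_BITS.
    carry = 0
    for k in range(len(acc)):
        total = acc[k] + carry
        acc[k] = total & MASK
        carry = total >> RADIX_BITS

    return from_limbs(acc)
```

In a SIMD setting, the inner accumulation of phase 1 is the part that maps onto vector lanes, since the lanes do not depend on each other until the normalization pass.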

Pages: 211-218

Page count: 8
