Acceleration of Large Integer Multiplication with Intel AVX-512 Instructions

Cited by: 3
Authors
Edamatsu, Takuya [1]
Takahashi, Daisuke [2]
Affiliations
[1] Univ Tsukuba, Coll Informat Sci, Tsukuba, Ibaraki, Japan
[2] Univ Tsukuba, Ctr Computat Sci, Tsukuba, Ibaraki, Japan
Keywords
AVX-512; SIMD; Knights Landing; large integer multiplication; reduced-radix representation
DOI
10.1109/HPCC/SmartCity/DSS.2018.00059
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose an implementation of large integer multiplication using Single Instruction Multiple Data (SIMD) instructions. We evaluated the implementation on an Intel Xeon Phi processor. The second generation Intel Xeon Phi processor, Knights Landing, has a set of Advanced Vector Extensions-512 (AVX-512) instructions. Using AVX-512, the processor can handle 512 bits at the same time and has the potential to multiply faster than a processor using Streaming SIMD Extensions (SSE) and AVX. Therefore, we applied AVX-512F (foundation) instructions to the program. In the multiplication of large integers, as the number of digits increases, various processing costs also become larger. One of these costs is carry processing. Therefore, we implemented a multiplication function using a reduced-radix representation and compared the execution time and the number of instructions against the GNU Multiple Precision Arithmetic Library (GMP). Furthermore, we used some optimization techniques for this kernel. We successfully achieved an execution time that was approximately 2.5x faster than GMP on the Knights Landing architecture.
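The carry-processing cost mentioned in the abstract can be illustrated with a small scalar sketch of a reduced-radix multiplication. This is not the authors' AVX-512 kernel: the 28-bit limb width, the 64-bit lane limit, and the helper names `to_limbs`, `from_limbs`, and `mul_reduced_radix` are assumptions chosen for illustration. The idea shown is that limbs narrower than the machine word leave headroom, so partial products can be accumulated without carrying and the carries can be resolved in a single pass at the end.

```python
# Illustrative scalar model of reduced-radix multiplication.
# RADIX_BITS = 28 is an assumed limb width, not a value taken from the paper.
RADIX_BITS = 28
MASK = (1 << RADIX_BITS) - 1
LANE_MAX = (1 << 64) - 1  # models one 64-bit SIMD lane


def to_limbs(n):
    """Split a non-negative integer into little-endian 28-bit limbs."""
    limbs = []
    while True:
        limbs.append(n & MASK)
        n >>= RADIX_BITS
        if n == 0:
            return limbs


def from_limbs(limbs):
    """Recombine little-endian limbs into a Python integer."""
    return sum(limb << (RADIX_BITS * i) for i, limb in enumerate(limbs))


def mul_reduced_radix(a, b):
    """Multiply a and b using delayed carry propagation."""
    x, y = to_limbs(a), to_limbs(b)
    acc = [0] * (len(x) + len(y))

    # Phase 1: accumulate partial products column by column, with no carries.
    # Each product is below 2**56, so a 64-bit lane has 8 bits of headroom.
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            acc[i + j] += xi * yj

    # The headroom is finite: about 256 products per column at most.
    if max(acc) > LANE_MAX:
        raise ValueError("operands too long for one delayed-carry pass")

    # Phase 2: one carry-propagation pass back to normalized 28-bit limbs.
    carry = 0
    for k in range(len(acc)):
        total = acc[k] + carry
        acc[k] = total & MASK
        carry = total >> RADIX_BITS
    return from_limbs(acc)
```

In a full-radix representation, where every limb fills the whole word, each partial product could overflow and would need its carry handled immediately. With the assumed 28-bit limbs, operands of more than about 256 limbs would exceed the headroom in this sketch, so a real implementation would have to insert intermediate carry passes.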
Pages: 211-218
Page count: 8