Acceleration of Large Integer Multiplication with Intel AVX-512 Instructions

Cited by: 3
Authors
Edamatsu, Takuya [1]
Takahashi, Daisuke [2]
Affiliations
[1] Univ Tsukuba, Coll Informat Sci, Tsukuba, Ibaraki, Japan
[2] Univ Tsukuba, Ctr Computat Sci, Tsukuba, Ibaraki, Japan
Keywords
AVX-512; SIMD; Knights Landing; large integer multiplication; reduced-radix representation;
DOI
10.1109/HPCC/SmartCity/DSS.2018.00059
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose an implementation of large integer multiplication using Single Instruction Multiple Data (SIMD) instructions. We evaluated the implementation on an Intel Xeon Phi processor. The second generation Intel Xeon Phi processor, Knights Landing, has a set of Advanced Vector Extensions-512 (AVX-512) instructions. Using AVX-512, the processor can handle 512 bits at the same time and has the potential to multiply faster than a processor using Streaming SIMD Extensions (SSE) and AVX. Therefore, we applied AVX-512F (foundation) instructions to the program. In the multiplication of large integers, as the number of digits increases, various processing costs also become larger. One of these costs is carry processing. Therefore, we implemented a multiplication function using a reduced-radix representation and compared the execution time and the number of instructions against the GNU Multiple Precision Arithmetic Library (GMP). Furthermore, we used some optimization techniques for this kernel. We successfully achieved an execution time that was approximately 2.5x faster than GMP on the Knights Landing architecture.
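The reduced-radix idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' kernel: the 28-bit limb width, the function names, and the schoolbook loop are all assumptions made for illustration, and plain Python integers stand in for the 64-bit SIMD lanes. Each limb holds fewer bits than the lane that stores it, so partial products can be accumulated with no carry handling and normalized in one pass at the end.

```python
# Assumed limb width; the abstract does not state the radix used in the paper.
RADIX_BITS = 28
MASK = (1 << RADIX_BITS) - 1


def to_limbs(n):
    """Split a non-negative integer into little-endian 28-bit limbs."""
    limbs = []
    while n:
        limbs.append(n & MASK)
        n >>= RADIX_BITS
    return limbs or [0]


def from_limbs(limbs):
    """Recombine little-endian limbs into a single integer."""
    n = 0
    for limb in reversed(limbs):
        n = (n << RADIX_BITS) + limb
    return n


def mul_reduced_radix(a, b):
    """Schoolbook multiplication with carry propagation deferred to the end."""
    x, y = to_limbs(a), to_limbs(b)
    acc = [0] * (len(x) + len(y))
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            # Each product is below 2**56, so a 64-bit lane could absorb
            # up to 256 of them before a carry pass becomes necessary.
            acc[i + j] += xi * yj
    # Single normalization pass: push the excess bits of each column upward.
    carry = 0
    for k in range(len(acc)):
        total = acc[k] + carry
        acc[k] = total & MASK
        carry = total >> RADIX_BITS
    return from_limbs(acc)
```

Because Python integers never overflow, the sketch does not model the lane limit itself. A fixed-width implementation under the same assumption would need an intermediate carry pass once more than 256 products land in one column.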
Pages: 211-218
Number of pages: 8
Related papers
50 in total
  • [31] Acceleration of Homomorphic Unrolled Trace-Type Function using AVX512 instructions
    Inoue, Kotaro
    Suzuki, Takuya
    Yamana, Hayato
    PROCEEDINGS OF THE 10TH WORKSHOP ON ENCRYPTED COMPUTING & APPLIED HOMOMORPHIC CRYPTOGRAPHY, WAHC 2022, 2022, : 47 - 52
  • [33] AVX-512 Based Software Decoding for 5G LDPC Codes
    Xu, Yi
    Wang, Wenjin
    Xu, Then
    Gao, Xiqi
    PROCEEDINGS OF THE 2019 IEEE INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS 2019), 2019, : 54 - 59
  • [34] Parallel Vectorized Algorithms for Computing Trigonometric Sums Using AVX-512 Extensions
    Stpiczynski, Przemyslaw
    COMPUTATIONAL SCIENCE, ICCS 2024, PT VI, 2024, 14937 : 158 - 172
  • [35] Computing the sparse matrix vector product using block-based kernels without zero padding on processors with AVX-512 instructions
    Bramas, Berenger
    Kus, Pavel
    PEERJ COMPUTER SCIENCE, 2018,
  • [36] Combining Algorithmic Rethinking and AVX-512 Intrinsics for Efficient Simulation of Subcellular Calcium Signaling
    Jarvis, Chad
    Lines, Glenn Terje
    Langguth, Johannes
    Nakajima, Kengo
    Cai, Xing
    COMPUTATIONAL SCIENCE - ICCS 2019, PT V, 2019, 11540 : 681 - 687
  • [37] Acceleration of Particle Swarm Optimization with AVX Instructions
    Safarik, Jakub
    Snasel, Vaclav
    APPLIED SCIENCES-BASEL, 2023, 13 (02):
  • [39] Hydrogen-helium chemical and nuclear galaxy collision: Hydrodynamic simulations on AVX-512 supercomputers
    Chernykh, Igor
    Kulikov, Igor
    Tutukov, Alexander
    JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 2021, 391 (391)
  • [40] Vectorization of High-performance Scientific Calculations Using AVX-512 Intruction Set
    Shabanov, B. M.
    Rybakov, A. A.
    Shumilin, S. S.
    LOBACHEVSKII JOURNAL OF MATHEMATICS, 2019, 40 (05) : 580 - 598