Acceleration of Large Integer Multiplication with Intel AVX-512 Instructions

Cited by: 3
Authors
Edamatsu, Takuya [1]
Takahashi, Daisuke [2]
Affiliations
[1] Univ Tsukuba, Coll Informat Sci, Tsukuba, Ibaraki, Japan
[2] Univ Tsukuba, Ctr Computat Sci, Tsukuba, Ibaraki, Japan
Keywords
AVX-512; SIMD; Knights Landing; large integer multiplication; reduced-radix representation;
DOI
10.1109/HPCC/SmartCity/DSS.2018.00059
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose an implementation of large integer multiplication using Single Instruction Multiple Data (SIMD) instructions. We evaluated the implementation on an Intel Xeon Phi processor. The second generation Intel Xeon Phi processor, Knights Landing, has a set of Advanced Vector Extensions-512 (AVX-512) instructions. Using AVX-512, the processor can handle 512 bits at the same time and has the potential to multiply faster than a processor using Streaming SIMD Extensions (SSE) and AVX. Therefore, we applied AVX-512F (foundation) instructions to the program. In the multiplication of large integers, as the number of digits increases, various processing costs also become larger. One of these costs is carry processing. Therefore, we implemented a multiplication function using a reduced-radix representation and compared the execution time and the number of instructions against the GNU Multiple Precision Arithmetic Library (GMP). Furthermore, we used some optimization techniques for this kernel. We successfully achieved an execution time that was approximately 2.5x faster than GMP on the Knights Landing architecture.
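The abstract names carry processing as a cost that a reduced-radix representation is meant to cut. The following is a minimal Python sketch of that general idea, not the authors' AVX-512 kernel: each limb holds fewer bits than the word that stores it, so partial products can be summed column by column with no carry handling, and carries are resolved in a single pass at the end. The 28-bit limb width and the names `RADIX_BITS`, `to_limbs`, `from_limbs`, and `mul_reduced_radix` are assumptions chosen for illustration; the record does not state the radix used in the paper.

```python
# Illustrative sketch only. RADIX_BITS = 28 is an assumed limb width:
# a 28x28-bit product needs 56 bits, so a 64-bit accumulator would have
# 8 spare bits to absorb sums of partial products before overflowing.
RADIX_BITS = 28
MASK = (1 << RADIX_BITS) - 1


def to_limbs(n):
    """Split a non-negative integer into little-endian RADIX_BITS-bit limbs."""
    limbs = []
    while True:
        limbs.append(n & MASK)
        n >>= RADIX_BITS
        if n == 0:
            return limbs


def from_limbs(limbs):
    """Reassemble an integer from normalized little-endian limbs."""
    value = 0
    for limb in reversed(limbs):
        value = (value << RADIX_BITS) | limb
    return value


def mul_reduced_radix(a, b):
    """Schoolbook multiplication with carry propagation deferred to the end."""
    x, y = to_limbs(a), to_limbs(b)
    acc = [0] * (len(x) + len(y))

    # Accumulate partial products per column; no carries are propagated here.
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            acc[i + j] += xi * yj

    # Single carry-normalization pass over the column sums.
    carry = 0
    for k in range(len(acc)):
        total = acc[k] + carry
        acc[k] = total & MASK
        carry = total >> RADIX_BITS

    return from_limbs(acc)


if __name__ == "__main__":
    a = 12345678901234567890
    b = 98765432109876543210
    print(mul_reduced_radix(a, b) == a * b)
```

Python integers never overflow, so the sketch shows only the structure of the computation. In fixed 64-bit lanes, the spare bits limit how many partial products a column can absorb before an intermediate carry pass is required.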
Pages: 211-218
Page count: 8