A Fast Parallel Matrix Inversion Algorithm based on Heterogeneous Multicore Architectures

Cited by: 0
Authors
Yu, Denggao [1]
He, Shiwen [1, 2]
Huang, Yongming [1]
Yu, Guangshi [1]
Yang, Luxi [1]
Affiliations
[1] Southeast Univ, Sch Informat Sci & Engn, Nanjing 210096, Jiangsu, Peoples R China
[2] Southeast Univ, Dept Radio Engn, State Key Lab Millimeter Waves, Nanjing 210096, Jiangsu, Peoples R China
Keywords
matrix inversion; high performance computing; software-defined radio; GPU; CUDA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large matrix inversion is a basic step in a wide range of signal processing and numerical problems, such as digital filtering and equalization detection. An algorithm that inverts large matrices quickly and accurately is therefore essential. Meanwhile, the graphics processing unit (GPU) provides a low-cost and flexible multicore architecture for high performance computing, which has attracted many researchers to building GPU-based software-defined radio (SDR). In this paper, we propose a fast parallel algorithm for matrix inversion on heterogeneous multicore architectures that exploits the computational power of the GPU. Our implementation is based on a modified Squared Givens Rotations (SGR) algorithm, which adapts effectively to the GPU architecture. Implemented on the Compute Unified Device Architecture (CUDA), it achieves a speedup of more than 20x over the CPU-only algorithm when the matrix becomes large, and runs at up to 12.14 GFLOPS on a GeForce GT 620 graphics processor.
Pages: 903-907
Page count: 5
Related papers
50 in total
  • [1] PARALLEL PROGRAMMING MODELS FOR HETEROGENEOUS MULTICORE ARCHITECTURES
    Ferrer, Roger
    Bellens, Pieter
    Beltran, Vicenc
    Gonzalez, Marc
    Martorell, Xavier
    Badia, Rosa M.
    Ayguade, Eduard
    Yeom, Jae-Seung
    Schneider, Scott
    Koukos, Konstantinos
    Alvanos, Michail
    Nikolopoulos, Dimitrios S.
    Bilas, Angelos
    IEEE MICRO, 2010, 30 (05) : 42 - 53
  • [2] Triangular Matrix Inversion on Heterogeneous Multicore Systems
    Ries, Florian
    De Marco, Tommaso
    Guerrieri, Roberto
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2012, 23 (01) : 177 - 184
  • [3] High Performance Recursive Matrix Inversion for Multicore Architectures
    Mahfoudhi, Ryma
    Achour, Sami
    Hamdi-Larbi, Olfa
    Mahjoub, Zaher
    2017 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING & SIMULATION (HPCS), 2017, : 675 - 682
  • [4] THE PARALLEL TILED WZ FACTORIZATION ALGORITHM FOR MULTICORE ARCHITECTURES
    Bylina, Beata
    Bylina, Jaroslaw
    INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND COMPUTER SCIENCE, 2019, 29 (02) : 407 - 419
  • [5] A fast parallel Gauss Jordan algorithm for matrix inversion using CUDA
    Sharma, Girish
    Agarwala, Abhishek
    Bhattacharya, Baidurya
    COMPUTERS & STRUCTURES, 2013, 128 : 31 - 37
  • [6] Energy efficient scheduling algorithm for the multicore heterogeneous embedded architectures
    Anuradha, P.
    Rallapalli, Hemalatha
    Narsimha, G.
    DESIGN AUTOMATION FOR EMBEDDED SYSTEMS, 2018, 22 (1-2) : 1 - 12
  • [7] Energy efficient scheduling algorithm for the multicore heterogeneous embedded architectures
    P. Anuradha
    Hemalatha Rallapalli
    G. Narsimha
    Design Automation for Embedded Systems, 2018, 22 : 1 - 12
  • [8] High-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures
    Su, Huayou
    Wen, Mei
    Ren, Ju
    Wu, Nan
    Chai, Jun
    Zhang, Chunyuan
    RADIOENGINEERING, 2012, 21 (01) : 46 - 55
  • [9] A parallel algorithm for advanced video motion estimation on multicore architectures
    Momcilovic, Svetislav
    Sousa, Leonel
    CISIS 2008: THE SECOND INTERNATIONAL CONFERENCE ON COMPLEX, INTELLIGENT AND SOFTWARE INTENSIVE SYSTEMS, PROCEEDINGS, 2008, : 831 - 836
  • [10] JParEnt: Parallel entropy decoding for JPEG decompression on heterogeneous multicore architectures
    Sodsong, Wasuwee
    Jung, Minyoung
    Park, Jinwoo
    Burgstaller, Bernd
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2017, 29 (15):