A Fast Parallel Matrix Inversion Algorithm based on Heterogeneous Multicore Architectures

Cited by: 0
Authors
Yu, Denggao [1]
He, Shiwen [1, 2]
Huang, Yongming [1]
Yu, Guangshi [1]
Yang, Luxi [1]
Affiliations
[1] Southeast Univ, Sch Informat Sci & Engn, Nanjing 210096, Jiangsu, Peoples R China
[2] Southeast Univ, Dept Radio Engn, State Key Lab Millimeter Waves, Nanjing 210096, Jiangsu, Peoples R China
Keywords
matrix inversion; high performance computing; software-defined radio; GPU; CUDA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large matrix inversion is a basic step in a wide range of signal processing and numerical problems, such as digital filtering and equalization detection. It is therefore essential to have an algorithm that inverts large matrices quickly and accurately. Meanwhile, the Graphics Processing Unit (GPU) provides a low-cost and flexible multicore architecture for high performance computing, which has attracted many researchers to build GPU-based software-defined radio (SDR) systems. In this paper, we propose a fast parallel algorithm for matrix inversion on heterogeneous multicore architectures that exploits the computational power of the GPU. Our implementation is based on a modified Squared Givens Rotations (SGR) algorithm that adapts effectively to the GPU architecture. Implemented on the Compute Unified Device Architecture (CUDA), it achieves a speedup of more than 20x over the CPU-only algorithm when the matrix becomes large, and runs at up to 12.14 GFLOPS on a GeForce GT 620 graphics processor.
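The abstract does not give the details of the modified SGR algorithm, so the following is only a minimal sketch of the general idea it builds on: inverting a matrix through a QR factorization computed with standard (not squared) Givens rotations. The function name `givens_inverse` is hypothetical, the code runs on the CPU in plain Python, and it should not be read as the authors' method.

```python
import math

def givens_inverse(a):
    """Invert a square matrix via QR with standard Givens rotations.

    Illustrative only: this is the textbook rotation, not the paper's
    modified Squared Givens Rotations variant.
    """
    n = len(a)
    r = [[float(x) for x in row] for row in a]  # reduced to upper-triangular R
    qt = [[float(i == j) for j in range(n)] for i in range(n)]  # accumulates Q^T

    for j in range(n):
        for i in range(j + 1, n):
            if r[i][j] == 0.0:
                continue  # entry is already zero, no rotation needed
            h = math.hypot(r[j][j], r[i][j])
            c, s = r[j][j] / h, r[i][j] / h
            # Apply the same rotation to rows j and i of both R and Q^T.
            # Each column k is updated independently of the others.
            for m in (r, qt):
                for k in range(n):
                    top, bot = m[j][k], m[i][k]
                    m[j][k] = c * top + s * bot
                    m[i][k] = -s * top + c * bot

    # A = Q R, so A^{-1} = R^{-1} Q^T: solve R X = Q^T by back substitution.
    inv = [[0.0] * n for _ in range(n)]
    for col in range(n):
        for i in range(n - 1, -1, -1):
            acc = qt[i][col] - sum(r[i][k] * inv[k][col] for k in range(i + 1, n))
            inv[i][col] = acc / r[i][i]  # raises ZeroDivisionError if A is singular
    return inv

inverse = givens_inverse([[4.0, 7.0], [2.0, 6.0]])
```

Within one rotation, the per-column updates do not depend on each other, which is the kind of fine-grained parallelism a GPU implementation could assign to separate threads. How the paper actually schedules rotations on the GPU is not stated in the abstract.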
Pages: 903-907
Number of pages: 5