A Fast Parallel Matrix Inversion Algorithm based on Heterogeneous Multicore Architectures

Cited by: 0

Authors:

Yu, Denggao [1]

He, Shiwen [1,2]

Huang, Yongming [1]

Yu, Guangshi [1]

Yang, Luxi [1]

Affiliations:

[1] Southeast Univ, Sch Informat Sci & Engn, Nanjing 210096, Jiangsu, Peoples R China

[2] Southeast Univ, Dept Radio Engn, State Key Lab Millimeter Waves, Nanjing 210096, Jiangsu, Peoples R China

Source:

2015 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP) | 2015

Keywords:

matrix inversion; high performance computing; software-defined radio; GPU; CUDA

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Large matrix inversion is a basic step in a wide range of signal processing and numerical problems, such as digital filtering and equalization detection. An algorithm that inverts large matrices quickly and accurately is therefore essential. Meanwhile, the Graphics Processing Unit (GPU) provides a low-cost, flexible multicore architecture for high-performance computing, which has attracted many researchers to building GPU-based software-defined radio (SDR). In this paper, we propose a fast parallel algorithm for matrix inversion on heterogeneous multicore architectures that exploits the computational power of the GPU. Our implementation is based on a modified Squared Givens Rotations (SGR) algorithm that adapts effectively to the GPU architecture. Implemented on Compute Unified Device Architecture (CUDA), it achieves a speedup of more than 20x over the CPU-only algorithm when the matrix becomes large, and runs at up to 12.14 GFLOPS on a GeForce GT620 graphics processor.


Pages: 903-907

Page count: 5
