Optimizing the parallel computation of linear recurrences using compact matrix representations

Cited by: 4
Authors
Nistor, Adrian [2]
Chin, Wei-Ngan [1]
Tan, Tiow-Seng [1]
Tapus, Nicolae [2]
Affiliations
[1] Natl Univ Singapore, Dept Comp Sci, Singapore 117548, Singapore
[2] Univ Politehn Bucuresti, Dept Comp Sci, Bucharest, Romania
Keywords
Memory optimization; Recursion; Matrix multiplication; Programmable graphics hardware; GRAPHICS HARDWARE; ALGORITHMS;
DOI
10.1016/j.jpdc.2009.01.004
CLC number: TP301 [Theory, Methods]
Subject classification code: 081202
Abstract
This paper presents a novel method for optimizing the parallel computation of linear recurrences. Our method can help reduce the resource requirements for both memory and computation. A unique feature of our technique is its formulation of linear recurrences as matrix computations, before exploiting their mathematical properties for more compact representations. Based on a general notion of closure for matrix multiplication, we present two classes of matrices that have compact representations: permutation matrices and matrices whose elements are linearly related to each other. To validate the proposed method, we experiment with solving recurrences whose matrices have compact representations using CUDA on an NVIDIA GeForce 8800 GTX GPU. Our technique enables the computation of larger recurrences in parallel and provides speedups of up to eleven times over the unoptimized parallel computations; memory usage can be as much as nine times lower than that of the unoptimized parallel computations. Our result confirms a promising approach for the adoption of more advanced parallelization techniques. (C) 2009 Elsevier Inc. All rights reserved.
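To make the abstract's core idea concrete, here is a minimal sketch (in Python, not the paper's CUDA code) of how a first-order linear recurrence x[i] = a[i]*x[i-1] + b[i] can be cast as a product of 2x2 matrices [[a, b], [0, 1]]. This matrix class is closed under multiplication, and every product is again of the same form, so only the pair (a, b) needs to be stored; that is the flavor of "compact representation" the abstract describes. The function names and the sequential scan loop are illustrative assumptions, but the combine step is associative, which is what allows the parallel prefix-scan evaluation the paper targets.

```python
# Compact form of the 2x2 matrix [[a, b], [0, 1]]: just the pair (a, b).

def combine(m1, m2):
    """Compose two compact matrices (m2 applied after m1).
    Matrix product [[a2,b2],[0,1]] @ [[a1,b1],[0,1]] = [[a2*a1, a2*b1+b2],[0,1]]."""
    a1, b1 = m1
    a2, b2 = m2
    return (a2 * a1, a2 * b1 + b2)

def scan_recurrence(a, b, x0):
    """Evaluate x[i] = a[i]*x[i-1] + b[i] via an inclusive prefix scan
    over compact matrices. The loop below is sequential for clarity;
    because combine() is associative, the same scan parallelizes as a
    standard prefix sum (the setting of the paper's GPU experiments)."""
    out = []
    acc = (1.0, 0.0)  # identity matrix in compact form
    for ai, bi in zip(a, b):
        acc = combine(acc, (ai, bi))
        ac, bc = acc
        out.append(ac * x0 + bc)  # apply accumulated matrix to (x0, 1)
    return out

def direct(a, b, x0):
    """Plain sequential evaluation, for comparison."""
    x, out = x0, []
    for ai, bi in zip(a, b):
        x = ai * x + bi
        out.append(x)
    return out
```

The point of the compact form is storage: a full 2x2 matrix per scan element would need four values, while the closed class above needs only two, mirroring the memory reductions reported in the abstract.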
Pages: 373-381 (9 pages)