Design of three high-performance concurrent systolic arrays for band matrix multiplication

Cited by: 0

Authors:

Yang, Y [1]

Zhao, WQ [1]

Affiliations:

[1] Fudan Univ, Microelect Dept, ASIC & Syst State Key Lab, Shanghai 200433, Peoples R China

Source:

CHINESE JOURNAL OF ELECTRONICS | 2005, Vol. 14, No. 04

Keywords:

systolic array; band matrix multiplication; operation speed; cell efficiency; parallel operation;

DOI:

Not available

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic and communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Band matrix multiplication is widely used in concurrent systems. However, the traditional Kung-Leiserson systolic array for band matrix multiplication cannot achieve high cell efficiency, because only about 1/3 of its cells are active in each step. Three alternative designs are therefore presented, based on the ideas of "matrix compression" and "super pipelining". The new arrays rearrange and compress the data matrix, and either add processing elements (PEs) or reorder the operation sequence to raise cell efficiency. These changes yield higher cell efficiency and faster operation, at the cost of more intricate architectures. The results show that the best of the three arrays keeps almost 100% of its processing elements busy in each step, nearly three times the utilization of the traditional Kung-Leiserson array. The modifications also increase operation speed: in the best case, the multiplication completes in only 1/3 of the processing time.
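The "about 1/3" utilization figure can be illustrated with a rough software sketch. This is not the authors' design: it multiplies two band matrices in plain Python, counts the multiply-accumulate operations that are actually needed, and compares that count with the cell-steps available on an array of `w1 * w2` cells running for `3n + min(w1, w2)` steps. That step count is the figure commonly quoted for the Kung-Leiserson hexagonal array and is an assumption here, since the abstract does not state it. The function names `band_matmul` and `kl_utilization` are hypothetical.

```python
def band_matmul(a, b, n, pa, qa, pb, qb):
    """Multiply two n x n band matrices given as dense lists of lists.

    a is nonzero only where -pa <= k - i <= qa (pa sub-, qa super-diagonals);
    b is nonzero only where -pb <= j - k <= qb.
    Returns (c, macs), where macs counts the multiply-accumulates performed.
    """
    c = [[0] * n for _ in range(n)]
    macs = 0
    for i in range(n):
        # Only columns of a inside its band can contribute to row i.
        for k in range(max(0, i - pa), min(n, i + qa + 1)):
            # Only columns of b inside its band are nonzero in row k.
            for j in range(max(0, k - pb), min(n, k + qb + 1)):
                c[i][j] += a[i][k] * b[k][j]
                macs += 1
    return c, macs


def kl_utilization(n, w1, w2, macs):
    """Fraction of cell-steps doing useful work.

    Assumes w1 * w2 cells and 3n + min(w1, w2) steps (assumed figure for
    the Kung-Leiserson hexagonal array, not taken from the abstract).
    """
    steps = 3 * n + min(w1, w2)
    return macs / (w1 * w2 * steps)
```

Since a band product needs at most `n * w1 * w2` multiply-accumulates, the ratio returned by `kl_utilization` stays below 1/3 under this assumption, which is consistent with the figure quoted in the abstract.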

Pages: 559-563

Page count: 5

50 records in total

[31] Rapid algorithm for matrix multiplication and its efficient implementation on systolic arrays
Elfimova, L.D.
Kapitonova, Yu.V.
Kibernetika i Sistemnyj Analiz, 2001, (01) : 135 - 151
[32] Fault-tolerant high-performance matrix multiplication: Theory and practice
Gunnels, JA
Katz, DS
Quintana-Ortí, ES
van de Geijn, RA
INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS, PROCEEDINGS, 2001, : 47 - 56
[33] High-performance FIR filter design based on sharing multiplication
Park, J
Muhammad, K
Roy, K
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2003, 11 (02) : 244 - 253
[34] Design of a High-Performance Tensor-Vector Multiplication with BLAS
Bassoy, Cem
COMPUTATIONAL SCIENCE - ICCS 2019, PT I, 2019, 11536 : 32 - 45
[35] DESIGN CONSIDERATIONS FOR HIGH-PERFORMANCE AVALANCHE PHOTODIODE MULTIPLICATION LAYERS
CHANDRAMOULI, V
MAZIAR, CM
CAMPBELL, JC
IEEE TRANSACTIONS ON ELECTRON DEVICES, 1994, 41 (05) : 648 - 654
[36] MODULE TO PERFORM MULTIPLICATION, DIVISION, AND SQUARE ROOT IN SYSTOLIC ARRAYS FOR MATRIX COMPUTATIONS
ERCEGOVAC, MD
LANG, T
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1991, 11 (03) : 212 - 221
[37] Reducing the number of processors elements in systolic arrays for matrix multiplication using linear transformation matrix
Snopce, Halil
Elmazi, Lavdrim
INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL, 2008, 3 : 486 - 490
[38] Two-level pipelined systolic arrays for matrix-vector multiplication
Milentijevic, IZ
Milovanovic, IZ
Milovanovic, EI
Tosic, MB
Stojcev, MK
JOURNAL OF SYSTEMS ARCHITECTURE, 1998, 44 (05) : 383 - 387
[39] On the Reliability of Xilinx's Deep Processing Unit and Systolic Arrays for Matrix Multiplication
Libano, F.
Rech, P.
Brunhaver, J.
2020 20TH EUROPEAN CONFERENCE ON RADIATION AND ITS EFFECTS ON COMPONENTS AND SYSTEMS (RADECS 2020), 2022, : 84 - 88
[40] Design Patterns for High-Performance Matrix Computations
Son, Hoang M.
MODELING, SIMULATION AND OPTIMIZATION OF COMPLEX PROCESSES, 2008, : 509 - 519
