Design of three high-performance concurrent systolic arrays for band matrix multiplication

Cited by: 0
Authors
Yang, Y [1]
Zhao, WQ [1]
Affiliation
[1] Fudan Univ, Microelect Dept, ASIC & Syst State Key Lab, Shanghai 200433, Peoples R China
Source
CHINESE JOURNAL OF ELECTRONICS | 2005 / Vol. 14 / No. 04
Keywords
systolic array; band matrix multiplication; operation speed; cell efficiency; parallel operation;
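DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract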
Band matrix multiplication is widely used in concurrent systems. However, the traditional Kung-Leiserson systolic array for band matrix multiplication cannot achieve high cell efficiency, because only about 1/3 of its cells are active in each step. Three alternative designs are therefore presented, based on the ideas of "matrix compression" and "super-pipelining". The new arrays rearrange and compress the data matrices, and either add processing elements (PEs) or reorder the operation sequence to raise cell efficiency. These changes yield higher cell efficiency and faster operation at the cost of more intricate architectures. The results show that the best of the three arrays keeps almost 100% of its processing elements busy in each step, nearly three times the utilization of the traditional Kung-Leiserson array. The modifications also increase operation speed: at best, the multiplication completes in only 1/3 of the processing time.
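For readers unfamiliar with the operation the arrays accelerate, the following is a minimal software sketch of band matrix multiplication. It is not the paper's hardware design and does not model the systolic data flow. The function names (`make_band`, `dense_matmul`, `band_matmul`) and the bandwidth parameters `pa`, `qa`, `pb`, `qb` are illustrative assumptions, not taken from the paper. It shows only that multiply-accumulate steps can be restricted to index triples inside both bands, which is the work the processing elements share.

```python
def make_band(n, p, q):
    """Hypothetical helper: n x n integer matrix, nonzero only where -p <= k - i <= q."""
    return [[(i + k + 1) if -p <= k - i <= q else 0 for k in range(n)]
            for i in range(n)]


def dense_matmul(A, B):
    """Reference product that ignores the band structure."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def band_matmul(A, B, pa, qa, pb, qb):
    """Multiply band matrices A (lower/upper bandwidth pa/qa) and B (pb/qb).

    Returns the product and the number of multiply-accumulate steps performed.
    """
    n = len(A)
    C = [[0] * n for _ in range(n)]
    macs = 0
    for i in range(n):
        # The product is itself banded: lower bandwidth pa + pb, upper qa + qb.
        for j in range(max(0, i - pa - pb), min(n, i + qa + qb + 1)):
            # Only k inside both bands can contribute a nonzero term.
            lo = max(0, i - pa, j - qb)
            hi = min(n - 1, i + qa, j + pb)
            for k in range(lo, hi + 1):
                C[i][j] += A[i][k] * B[k][j]
                macs += 1
    return C, macs


if __name__ == "__main__":
    n = 6
    A = make_band(n, 1, 1)   # tridiagonal
    B = make_band(n, 2, 1)
    C, macs = band_matmul(A, B, 1, 1, 2, 1)
    print(C == dense_matmul(A, B), macs, n ** 3)
```

The count returned as `macs` is the total useful work. A systolic array's cell efficiency, in the sense used in the abstract, is the fraction of cells doing such useful work in a given step.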
Pages: 559-563
Number of pages: 5
Related papers
50 in total
  • [31] Rapid algorithm for matrix multiplication and its efficient implementation on systolic arrays
    Elfimova, L.D.
    Kapitonova, Yu.V.
    Kibernetika i Sistemnyj Analiz, 2001, (01): : 135 - 151
  • [32] Fault-tolerant high-performance matrix multiplication: Theory and practice
    Gunnels, JA
    Katz, DS
    Quintana-Ortí, ES
    van de Geijn, RA
    INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS, PROCEEDINGS, 2001, : 47 - 56
  • [33] High-performance FIR filter design based on sharing multiplication
    Park, J
    Muhammad, K
    Roy, K
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2003, 11 (02) : 244 - 253
  • [34] Design of a High-Performance Tensor-Vector Multiplication with BLAS
    Bassoy, Cem
    COMPUTATIONAL SCIENCE - ICCS 2019, PT I, 2019, 11536 : 32 - 45
  • [35] DESIGN CONSIDERATIONS FOR HIGH-PERFORMANCE AVALANCHE PHOTODIODE MULTIPLICATION LAYERS
    CHANDRAMOULI, V
    MAZIAR, CM
    CAMPBELL, JC
    IEEE TRANSACTIONS ON ELECTRON DEVICES, 1994, 41 (05) : 648 - 654
  • [36] MODULE TO PERFORM MULTIPLICATION, DIVISION, AND SQUARE ROOT IN SYSTOLIC ARRAYS FOR MATRIX COMPUTATIONS
    ERCEGOVAC, MD
    LANG, T
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1991, 11 (03) : 212 - 221
  • [37] Reducing the number of processors elements in systolic arrays for matrix multiplication using linear transformation matrix
    Snopce, Halil
    Elmazi, Lavdrim
    INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL, 2008, 3 : 486 - 490
  • [38] Two-level pipelined systolic arrays for matrix-vector multiplication
    Milentijevic, IZ
    Milovanovic, IZ
    Milovanovic, EI
    Tosic, MB
    Stojcev, MK
    JOURNAL OF SYSTEMS ARCHITECTURE, 1998, 44 (05) : 383 - 387
  • [39] On the Reliability of Xilinx's Deep Processing Unit and Systolic Arrays for Matrix Multiplication
    Libano, F.
    Rech, P.
    Brunhaver, J.
    2020 20TH EUROPEAN CONFERENCE ON RADIATION AND ITS EFFECTS ON COMPONENTS AND SYSTEMS (RADECS 2020), 2022, : 84 - 88
  • [40] Design Patterns for High-Performance Matrix Computations
    Son, Hoang M.
    MODELING, SIMULATION AND OPTIMIZATION OF COMPLEX PROCESSES, 2008, : 509 - 519