Performance-Driven LSTM Accelerator Hardware Using Split-Matrix-Based MVM

Cited by: 1
Authors
Joseph, Tresa [1]
Bindiya, T. S. [1]
Affiliations
[1] Natl Inst Technol Calicut, Dept Elect & Commun Engn, Kattangal 673601, Kerala, India
Keywords
Recurrent neural network; Long short-term memory; Systolic array architecture; Parallel computing; RECURRENT NEURAL-NETWORKS;
DOI
10.1007/s00034-023-02412-4
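Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract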
This paper proposes a new hardware approach for accelerating matrix-vector multiplication (MVM) using a systolic array architecture and parallel data processing units, which is particularly useful in multiplication-intensive applications such as neural networks. The hardware complexity of the parallel computations is reduced by a technique called the split-matrix approach, in which larger matrices are split into smaller matrices. The proposed architecture uses an 8-bit fixed-point representation and treats the matrices as circulant. The resulting MVM architecture benefits from reduced implementation complexity in terms of cell area, along with reduced delay and power consumption. Compared with the latest baseline design, it achieves a 13.9% reduction in logic cell area and a 38.15% reduction in total power consumption. The proposed architecture also achieves a considerably improved minimum permissible clock period of 0.410 ns. A long short-term memory (LSTM) architecture built on the proposed design further demonstrates the effectiveness of the proposed MVM architecture. The LSTM developed using the proposed MVM provides a 37.57% reduction in cell area and a 22.86% reduction in total power in comparison with the latest baseline design, and achieves a minimum clock period of 0.42 ns.
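The abstract does not spell out how the split-matrix scheme is organized, so the following is only a minimal software sketch of the general idea, under stated assumptions: a large matrix is split into square blocks, each block is circulant and therefore fully described by its first column, and operands are raw 8-bit signed integers (the fixed-point scaling is not given in the abstract). The function names, the block layout, and the wide Python-integer accumulator are all hypothetical and are not taken from the paper's hardware design.

```python
# Illustrative sketch only. Block layout, names, and the int8 range check are
# assumptions for this example, not the architecture described in the paper.

def is_int8(values):
    """Check that every value fits a signed 8-bit operand (assumed format)."""
    return all(-128 <= v <= 127 for v in values)

def circulant_block(first_col):
    """Expand an n x n circulant block: C[i][j] = first_col[(i - j) % n]."""
    n = len(first_col)
    return [[first_col[(i - j) % n] for j in range(n)] for i in range(n)]

def circulant_mvm(first_col, x):
    """Multiply one circulant block by x using only its n stored values."""
    n = len(first_col)
    return [sum(first_col[(i - j) % n] * x[j] for j in range(n))
            for i in range(n)]

def split_matrix_mvm(blocks, x, n):
    """MVM over a matrix split into n x n circulant blocks.

    blocks[r][c] holds the first column of the block at block-row r and
    block-column c. Partial products are summed in a wide accumulator.
    """
    y = []
    for block_row in blocks:
        acc = [0] * n
        for c, first_col in enumerate(block_row):
            part = circulant_mvm(first_col, x[c * n:(c + 1) * n])
            acc = [a + p for a, p in zip(acc, part)]
        y.extend(acc)
    return y

def dense_from_blocks(blocks, n):
    """Rebuild the full dense matrix, used here only as a reference."""
    rows = []
    for block_row in blocks:
        expanded = [circulant_block(fc) for fc in block_row]
        for i in range(n):
            rows.append([v for blk in expanded for v in blk[i]])
    return rows

def dense_mvm(matrix, x):
    """Plain row-by-row matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]

# A 4 x 4 matrix split into four 2 x 2 circulant blocks.
blocks = [[[1, 2], [3, -4]],
          [[5, 6], [-7, 8]]]
x = [1, -2, 3, 4]

assert is_int8(x) and all(is_int8(fc) for row in blocks for fc in row)
y_split = split_matrix_mvm(blocks, x, 2)
y_dense = dense_mvm(dense_from_blocks(blocks, 2), x)
assert y_split == y_dense
print(y_split)
```

In this sketch each n x n circulant block stores n values instead of n², and the per-block products are independent of one another, which is the kind of structure that parallel processing units could exploit. How the paper maps this onto its systolic array is not stated in the abstract.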
Pages: 6660-6683
Page count: 24
Source
Circuits, Systems, and Signal Processing, 2023, 42: 6660-6683