Performance-Driven LSTM Accelerator Hardware Using Split-Matrix-Based MVM

Cited by: 1

Authors:

Joseph, Tresa [1]

Bindiya, T. S. [1]

Affiliations:

[1] Natl Inst Technol Calicut, Dept Elect & Commun Engn, Kattangal 673601, Kerala, India

Source:

CIRCUITS SYSTEMS AND SIGNAL PROCESSING | 2023 / Vol. 42 / Issue 11

Keywords:

Recurrent neural network; Long short-term memory; Systolic array architecture; Parallel computing; RECURRENT NEURAL-NETWORKS;

DOI:

10.1007/s00034-023-02412-4

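Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes: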

0808 ; 0809 ;

Abstract:

This paper proposes a new hardware approach for accelerating matrix-vector multiplication (MVM) that employs a systolic array architecture and parallel data processing units, which is particularly useful in multiplication-intensive applications such as neural networks. The hardware complexity of the parallel computations is reduced by a technique called the split-matrix approach, in which larger matrices are split into smaller matrices. The proposed architecture uses an 8-bit fixed-point representation and treats the matrices as circulant. The resulting MVM architecture benefits from reduced implementation complexity in terms of cell area, delay, and power consumption. It achieves a 13.9% reduction in logic cell area and a 38.15% reduction in total power consumption compared with the latest baseline design. The proposed architecture also achieves a considerably improved minimum permissible clock period of 0.410 ns. A long short-term memory (LSTM) architecture built on the proposed design further demonstrates the effectiveness of the proposed MVM architecture. The LSTM developed using the proposed MVM provides a 37.57% reduction in cell area and a 22.86% reduction in total power compared with the latest baseline design, and achieves a minimum clock period of 0.42 ns.
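The split-matrix idea with circulant blocks can be illustrated in software. The following is only a minimal sketch under stated assumptions, not the architecture described in the paper: it assumes the large matrix is partitioned into square circulant blocks, each stored as its first row, and uses plain Python integers as stand-ins for 8-bit fixed-point values. The function names (`circulant_mvm`, `split_matrix_mvm`, `dense_from_blocks`) and the block layout are hypothetical.

```python
def circulant_mvm(first_row, x):
    """Multiply a circulant matrix (given by its first row) by vector x.

    Row i of the circulant matrix is first_row rotated right by i,
    so entry (i, j) equals first_row[(j - i) % n].
    """
    n = len(first_row)
    return [sum(first_row[(j - i) % n] * x[j] for j in range(n))
            for i in range(n)]


def split_matrix_mvm(blocks, x, b):
    """MVM over a matrix split into b x b circulant blocks.

    blocks[r][c] holds the first row of the block at block-row r,
    block-column c (an assumed storage layout). Each small product
    is independent, which is what makes parallel units possible.
    """
    out = []
    for block_row in blocks:
        acc = [0] * b  # accumulator is wider than the 8-bit inputs
        for c, first_row in enumerate(block_row):
            part = circulant_mvm(first_row, x[c * b:(c + 1) * b])
            acc = [a + p for a, p in zip(acc, part)]
        out.extend(acc)
    return out


def dense_from_blocks(blocks, b):
    """Expand the block description into a full matrix (for checking only)."""
    dense = []
    for block_row in blocks:
        for i in range(b):
            row = []
            for first_row in block_row:
                row.extend(first_row[(j - i) % b] for j in range(b))
            dense.append(row)
    return dense


# Small example: a 4 x 4 matrix made of four 2 x 2 circulant blocks.
blocks = [[[1, 2], [3, 4]],
          [[5, 6], [7, 8]]]
x = [1, 2, 3, 4]
y = split_matrix_mvm(blocks, x, 2)
```

Storing only the first row of each block is the usual reason circulant structure reduces storage; how the paper maps the blocks onto its systolic array is not specified in the abstract.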

Pages: 6660-6683

Page count: 24

