Balancing Computation Loads and Optimizing Input Vector Loading in LSTM Accelerators

Cited by: 5
Authors
Park, Junki [1 ]
Yi, Wooseok [1 ]
Ahn, Daehyun [1 ]
Kung, Jaeha [2 ]
Kim, Jae-Joon [1 ]
Affiliations
[1] Pohang Univ Sci & Technol, Dept Creat IT Engn, Pohang 37673, South Korea
[2] Daegu Gyeongbuk Inst Sci & Technol, Dept Informat & Commun Engn, Daegu 42988, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Sparse matrices; Logic gates; Hardware; Computer architecture; Clocks; History; Standards; Accelerators; computer architecture; hardware; machine learning; recurrent neural networks (RNNs);
DOI
10.1109/TCAD.2019.2926482
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The long short-term memory (LSTM) is a widely used neural network model for dealing with time-varying data. To reduce the memory requirement, pruning is often applied to the weight matrix of the LSTM, which makes the matrix sparse. In this paper, we present a new sparse matrix format, named rearranged compressed sparse column (RCSC), to maximize the inference speed of the LSTM hardware accelerator. The RCSC format speeds up inference by: 1) evenly distributing the computation loads across processing elements (PEs) and 2) reducing input vector load misses within the local buffer. We also propose a hardware architecture that adopts a hierarchical input buffer to further reduce the pipeline stalls that the RCSC format alone cannot handle. Simulation results for various datasets show that the combined use of the RCSC format and the proposed hardware yields, on average, a 2x shorter inference runtime than the previous work.
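The abstract does not spell out the RCSC layout, so the sketch below does not attempt to reproduce it. It only illustrates the background the abstract builds on: the standard compressed sparse column (CSC) encoding of a pruned weight matrix, and how the number of nonzeros per PE can become uneven. The row-to-PE mapping (`row % num_pes`), the helper names, and the example matrix `W` are all assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Tuple

def to_csc(dense: List[List[float]]) -> Tuple[List[float], List[int], List[int]]:
    """Encode a dense matrix in standard CSC form: (values, row_idx, col_ptr)."""
    n_rows = len(dense)
    n_cols = len(dense[0]) if dense else 0
    values, row_idx, col_ptr = [], [], [0]
    for c in range(n_cols):
        for r in range(n_rows):
            if dense[r][c] != 0:
                values.append(dense[r][c])
                row_idx.append(r)
        # col_ptr[c + 1] marks where column c ends in `values`.
        col_ptr.append(len(values))
    return values, row_idx, col_ptr

def csc_matvec(values, row_idx, col_ptr, x, n_rows):
    """Compute y = W @ x column by column, skipping zero inputs."""
    y = [0.0] * n_rows
    for c in range(len(col_ptr) - 1):
        if x[c] == 0:
            continue  # a zero input element contributes nothing
        for k in range(col_ptr[c], col_ptr[c + 1]):
            y[row_idx[k]] += values[k] * x[c]
    return y

def pe_loads(row_idx: List[int], num_pes: int) -> List[int]:
    """Count nonzeros per PE, ASSUMING row r is assigned to PE (r % num_pes)."""
    loads = [0] * num_pes
    for r in row_idx:
        loads[r % num_pes] += 1
    return loads

# Hypothetical pruned weight matrix (illustrative values only).
W = [
    [1, 0, 2, 0],
    [0, 0, 0, 0],
    [3, 4, 0, 5],
    [0, 0, 0, 6],
]
values, row_idx, col_ptr = to_csc(W)
y = csc_matvec(values, row_idx, col_ptr, [1, 1, 1, 1], n_rows=4)
loads = pe_loads(row_idx, num_pes=2)
```

With this toy matrix and two PEs, one PE receives most of the nonzeros while the other is nearly idle, so the slower PE determines the runtime. This is the kind of imbalance that the abstract says RCSC addresses by distributing the computation loads evenly.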
Pages: 1889-1901
Number of pages: 13