Balancing Computation Loads and Optimizing Input Vector Loading in LSTM Accelerators

Cited by: 5
Authors
Park, Junki [1]
Yi, Wooseok [1]
Ahn, Daehyun [1]
Kung, Jaeha [2]
Kim, Jae-Joon [1]
Affiliations
[1] Pohang Univ Sci & Technol, Dept Creat IT Engn, Pohang 37673, South Korea
[2] Daegu Gyeongbuk Inst Sci & Technol, Dept Informat & Commun Engn, Daegu 42988, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Sparse matrices; Logic gates; Hardware; Computer architecture; Clocks; History; Standards; Accelerators; computer architecture; hardware; machine learning; recurrent neural networks (RNNs);
DOI
10.1109/TCAD.2019.2926482
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Long short-term memory (LSTM) is a widely used neural network model for time-varying data. To reduce the memory requirement, pruning is often applied to the weight matrix of the LSTM, which makes the matrix sparse. In this paper, we present a new sparse matrix format, named rearranged compressed sparse column (RCSC), to maximize the inference speed of the LSTM hardware accelerator. The RCSC format speeds up inference by: 1) evenly distributing the computation loads to processing elements (PEs) and 2) reducing input vector load misses within the local buffer. We also propose a hardware architecture that adopts a hierarchical input buffer to further reduce the pipeline stalls that cannot be handled by the RCSC format alone. Simulation results for various datasets show that the combined use of the RCSC format and the proposed hardware achieves 2x shorter inference runtime on average compared to the previous work.
Pages: 1889-1901
Number of pages: 13