Implementation and Optimization of the Accelerator Based on FPGA Hardware for LSTM Network

Cited by: 10
Authors
Zhang, Yiwei [1]
Wang, Chao
Gong, Lei
Lu, Yuntao
Sun, Fan
Xu, Chongchong
Li, Xi
Zhou, Xuehai
Affiliation
[1] USTC, Dept Comp Sci & Technol, Hefei 230027, Anhui, Peoples R China
Funding
National Science Foundation (USA);
DOI
10.1109/ISPA/IUCC.2017.00098
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Artificial neural networks (ANNs) are important machine learning methods that are widely used in a variety of applications. Recurrent neural networks (RNNs), an emerging class of ANNs, are often used for sequence-related applications. Long Short-Term Memory (LSTM) is an improved RNN that contains complex computational logic. To achieve high accuracy, researchers often build large-scale LSTM networks, which are time-consuming and power-hungry. Accelerating LSTM networks while keeping power and energy consumption low has therefore become an active research topic. In this paper, we present a hardware accelerator for the LSTM neural network layer based on the FPGA ZedBoard and use pipelining to parallelize the forward computation. To optimize our implementation, we also use several methods, including tiled matrix-vector multiplication, a binary adder tree, and overlapping computation with data access. With these acceleration and optimization methods, our accelerator is power-efficient and outperforms both an ARM Cortex-A9 processor and an Intel Core i5 processor.
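As a rough illustration of two of the techniques named in the abstract, the software sketch below computes a matrix-vector product one column tile at a time and reduces each tile's products pairwise, the way a binary adder tree would. It is not the paper's design: the function names `adder_tree_sum` and `tiled_matvec`, the tile size, and the zero-padding of odd levels are all assumptions made for this example.

```python
def adder_tree_sum(values):
    """Sum values pairwise, level by level, as a binary adder tree would."""
    level = list(values)
    if not level:
        return 0.0
    while len(level) > 1:
        if len(level) % 2:  # odd count: pad with a zero operand (assumption)
            level.append(0.0)
        # Each pass models one level of two-input adders.
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0]


def tiled_matvec(matrix, vector, tile=4):
    """Compute matrix @ vector one column tile at a time (hypothetical helper)."""
    result = [0.0] * len(matrix)
    for start in range(0, len(vector), tile):
        # The slice of the input vector that a small on-chip buffer would hold.
        x_tile = vector[start:start + tile]
        for r, row in enumerate(matrix):
            products = [w * x for w, x in zip(row[start:start + tile], x_tile)]
            # Accumulate this tile's partial sum into the output element.
            result[r] += adder_tree_sum(products)
    return result
```

In hardware, the multiplications within a tile and the adders within one tree level would run in parallel; the loops here only mirror the order of the arithmetic.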
Pages: 614 - 621
Page count: 8
Related papers
50 in total
  • [31] FPGA based Hardware Accelerator for KAZE Feature Extraction Algorithm
    Kalms, Lester
    Elhossini, Ahmed
    Juurlink, Ben
    2016 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (FPT), 2016, : 281 - 284
  • [32] An FPGA-Based Hardware Accelerator for Traffic Sign Detection
    Shi, Weijing
    Li, Xin
    Yu, Zhiyi
    Overett, Gary
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2017, 25 (04) : 1362 - 1372
  • [33] FPGA-based hardware implementation of chaotic opposition-based arithmetic optimization algorithm
    Zermani, Mohamed Aymen
    Manita, Ghaith
    Chhabra, Amit
    Feki, Elyes
    Mami, Abdelkader
    APPLIED SOFT COMPUTING, 2024, 154
  • [34] JPEG hardware accelerator design for FPGA
    Duman, Kaan
    Cogun, Fuat
    Oektem, Levent
    2007 IEEE 15TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS, VOLS 1-3, 2007, : 386 - +
  • [35] FPGA Implementation of LSTM Based on Automatic Speech Recognition
    Li, Chen-Lu
    Huang, Yu-Jie
    Cai, Yu-Jie
    Han, Jun
    Zeng, Xiao-Yang
    2018 14TH IEEE INTERNATIONAL CONFERENCE ON SOLID-STATE AND INTEGRATED CIRCUIT TECHNOLOGY (ICSICT), 2018, : 1258 - 1260
  • [36] An FPGA Implementation of Stochastic Computing-based LSTM
    Maor, Guy
    Zeng, Xiaoming
    Wang, Zhendong
    Hu, Yang
    2019 IEEE 37TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2019), 2019, : 38 - 46
  • [37] EMBRYONIC SYSTEMS IMPLEMENTATION WITH FPGA-BASED ARTIFICIAL CELL NETWORK HARDWARE ARCHITECTURES
    Szasz, Csaba
    Chindris, Virgil
    Husi, Geza
    ASIAN JOURNAL OF CONTROL, 2010, 12 (02) : 208 - 215
  • [38] Deep Neural Network Accelerator based on FPGA
    Thang Viet Huynh
    2017 4TH NAFOSTED CONFERENCE ON INFORMATION AND COMPUTER SCIENCE (NICS), 2017, : 254 - 257
  • [39] A Spiking LSTM Accelerator for Automatic Speech Recognition Application Based on FPGA
    Yin, Tingting
    Dong, Feihong
    Chen, Chao
    Ouyang, Chenghao
    Wang, Zheng
    Yang, Yongkui
    ELECTRONICS, 2024, 13 (05)
  • [40] Real-Time Fixed-Point Hardware Accelerator of Convolutional Neural Network on FPGA Based
    Ozkilbac, Bahadir
    Ozbek, Ibrahim Yucel
    Karacali, Tevhit
    5TH INTERNATIONAL CONFERENCE ON COMPUTING AND INFORMATICS (ICCI 2022), 2022, : 1 - 5