Energy Efficient LSTM Accelerators for Embedded FPGAs Through Parameterised Architecture Design

Cited by: 4
Authors
Qian, Chao [1]
Ling, Tianheng [1]
Schiele, Gregor [1]
Affiliations
[1] Univ Duisburg, Embedded Syst Lab, Duisburg, Germany
Keywords
LSTM; Energy Efficiency; Embedded FPGAs
DOI
10.1007/978-3-031-42785-5_1
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Long Short-term Memory Networks (LSTMs) are a vital Deep Learning technique suitable for performing on-device time series analysis on local sensor data streams of embedded devices. In this paper, we propose a new hardware accelerator design for LSTMs specially optimised for resource-scarce embedded Field Programmable Gate Arrays (FPGAs). Our design improves the execution speed and reduces energy consumption compared to related work. Moreover, it can be adapted to different situations using a number of optimisation parameters, such as the usage of DSPs or the implementation of activation functions. We present our key design decisions and evaluate the performance. Our accelerator achieves an energy efficiency of 11.89 GOP/s/W during a real-time inference with 32873 samples/s.
Pages: 3-17
Page count: 15
Related papers
50 in total
  • [1] Exploring energy efficiency of LSTM accelerators: A parameterized architecture design for embedded FPGAs
    Qian, Chao
    Ling, Tianheng
    Schiele, Gregor
    JOURNAL OF SYSTEMS ARCHITECTURE, 2024, 152
  • [2] Scalable and parameterised VLSI architecture for efficient sparse approximation in FPGAs and SoCs
    Ren, F.
    Xu, W.
    Markovic, D.
    ELECTRONICS LETTERS, 2013, 49 (23) : 1440 - 1441
  • [3] ENERGY EFFICIENT ARCHITECTURE FOR MATRIX MULTIPLICATION ON FPGAS
    Matam, Kiran Kumar
    Hoang Le
    Prasanna, Viktor K.
    2013 23RD INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS (FPL 2013) PROCEEDINGS, 2013,
  • [4] An Efficient Sparse LSTM Accelerator on Embedded FPGAs with Bandwidth-oriented Pruning
    Li, Shiqing
    Zhu, Shien
    Luo, Xiangzhong
    Luo, Tao
    Liu, Weichen
    2023 33RD INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE LOGIC AND APPLICATIONS, FPL, 2023, : 42 - 48
  • [5] Energy Efficient Architecture for Graph Analytics Accelerators
    Ozdal, Muhammet Mustafa
    Yesil, Serif
    Kim, Taemin
    Ayupov, Andrey
    Greth, John
    Burns, Steven
    Ozturk, Ozcan
    2016 ACM/IEEE 43RD ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2016, : 166 - 177
  • [6] Design flow for embedded FPGAs based on a flexible architecture template
    Neumann, B.
    von Sydow, T.
    Blume, H.
    Noll, T. G.
    2008 DESIGN, AUTOMATION AND TEST IN EUROPE, VOLS 1-3, 2008, : 54 - +
  • [7] Design of OpenCL-Compatible Multithreaded Hardware Accelerators with Dynamic Support for Embedded FPGAs
    Rodriguez, Alfonso
    Valverde, Juan
    de la Torre, Eduardo
    2015 INTERNATIONAL CONFERENCE ON RECONFIGURABLE COMPUTING AND FPGAS (RECONFIG), 2015,
  • [8] Synetgy: Algorithm-hardware Co-design for ConvNet Accelerators on Embedded FPGAs
    Yang, Yifan
    Huang, Qijing
    Wu, Bichen
    Zhang, Tianjun
    Ma, Liang
    Gambardella, Giulio
    Blott, Michaela
    Lavagno, Luciano
    Vissers, Kees
    Wawrzynek, John
    Keutzer, Kurt
    PROCEEDINGS OF THE 2019 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE GATE ARRAYS (FPGA'19), 2019, : 23 - 32
  • [9] Enhancing Energy-Efficiency by Solving the Throughput Bottleneck of LSTM Cells for Embedded FPGAs
    Qian, Chao
    Ling, Tianheng
    Schiele, Gregor
    MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT I, 2023, 1752 : 594 - 605
  • [10] High Throughput Energy Efficient Parallel FFT Architecture on FPGAs
    Chen, Ren
    Park, Neungsoo
    Prasanna, Viktor K.
    2013 IEEE CONFERENCE ON HIGH PERFORMANCE EXTREME COMPUTING (HPEC), 2013,