FiC-RNN: A Multi-FPGA Acceleration Framework for Deep Recurrent Neural Networks

Cited by: 12
Authors
Sun, Yuxi [1]
Amano, Hideharu [1]
Affiliation
[1] Keio Univ, Dept Informat & Comp Sci, Yokohama, Kanagawa 2238522, Japan
Keywords
multi-FPGA; recurrent neural networks; LSTM
DOI
10.1587/transinf.2020PAP0003
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recurrent neural networks (RNNs) have been proven effective for sequence-based tasks thanks to their capability to process temporal information. In real-world systems, deep RNNs are more widely used to solve complicated tasks such as large-scale speech recognition and machine translation. However, the implementation of deep RNNs on traditional hardware platforms is inefficient due to long-range temporal dependence and irregular computation patterns within RNNs. This inefficiency manifests itself in the proportional increase in the latency of RNN inference with respect to the number of layers of deep RNNs on CPUs and GPUs. Previous work has focused mostly on optimizing and accelerating individual RNN cells. To make deep RNN inference fast and efficient, we propose an accelerator based on a multi-FPGA platform called Flow-in-Cloud (FiC). In this work, we show that the parallelism provided by the multi-FPGA system can be exploited to scale up the inference of deep RNNs by partitioning a large model onto several FPGAs, so that the latency stays close to constant as the number of RNN layers increases. For single-layer and four-layer RNNs, our implementation achieves 31x and 61x speedup, respectively, compared with an Intel CPU.
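The abstract's claim that latency stays near constant as layers are added can be illustrated with a toy timing model. The sketch below is not the authors' implementation or measurement method. It assumes a stacked unidirectional RNN with one layer per device, a fixed per-cell compute time, and an optional fixed inter-device link delay. The function names and parameters (`sequential_latency`, `pipelined_latency`, `cell_time`, `link_time`) are hypothetical.

```python
def sequential_latency(num_layers, seq_len, cell_time):
    """Single device: every (layer, time step) cell runs one after another."""
    return num_layers * seq_len * cell_time


def pipelined_latency(num_layers, seq_len, cell_time, link_time=0.0):
    """One layer per device (assumed mapping).

    Cell (layer, t) may start once cell (layer - 1, t) has finished and its
    output has crossed the link, and once cell (layer, t - 1) has finished.
    """
    # Finish times of the layer below; the raw input is available at time 0.
    below = [0.0] * seq_len
    for layer in range(num_layers):
        current = []
        last_finish = 0.0  # finish time of this layer's previous time step
        for t in range(seq_len):
            ready = below[t] + (link_time if layer > 0 else 0.0)
            start = max(ready, last_finish)
            last_finish = start + cell_time
            current.append(last_finish)
        below = current
    return below[-1]


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the paper.
    for layers in (1, 2, 4, 8):
        print(layers,
              sequential_latency(layers, 100, 1),
              pipelined_latency(layers, 100, 1))
```

Under these assumptions the pipelined latency grows as `seq_len + num_layers - 1` cell times rather than `seq_len * num_layers`, so for long sequences adding layers changes the total only slightly. Real systems add communication and memory effects that this model ignores.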
Pages: 2457-2462
Page count: 6