FiC-RNN: A Multi-FPGA Acceleration Framework for Deep Recurrent Neural Networks

Cited by: 12

Authors:

Sun, Yuxi [1]

Amano, Hideharu [1]

Affiliation:

[1] Keio Univ, Dept Informat & Comp Sci, Yokohama, Kanagawa 2238522, Japan

Source:

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2020, Vol. E103D, No. 12

Keywords:

multi-FPGA; recurrent neural networks; LSTM

DOI:

10.1587/transinf.2020PAP0003

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Recurrent neural networks (RNNs) have proven effective for sequence-based tasks thanks to their capability to process temporal information. In real-world systems, deep RNNs are more widely used to solve complicated tasks such as large-scale speech recognition and machine translation. However, implementing deep RNNs on traditional hardware platforms is inefficient because of the long-range temporal dependence and irregular computation patterns within RNNs. This inefficiency manifests itself on CPUs and GPUs as RNN inference latency that grows in proportion to the number of layers. Previous work has focused mostly on optimizing and accelerating individual RNN cells. To make deep RNN inference fast and efficient, we propose an accelerator based on a multi-FPGA platform called Flow-in-Cloud (FiC). In this work, we show that the parallelism provided by the multi-FPGA system can be exploited to scale up the inference of deep RNNs by partitioning a large model across several FPGAs, so that the latency stays close to constant as the number of RNN layers increases. For single-layer and four-layer RNNs, our implementation achieves 31x and 61x speedups, respectively, compared with an Intel CPU.
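The scaling claim in the abstract can be illustrated with a toy latency model. The abstract does not describe how the model is partitioned, so the sketch below assumes the simplest case: one RNN layer per FPGA, with each layer starting a timestep as soon as its input from the previous layer and its own previous state are available. The function names, the uniform `cell_time`, and the `link_time` parameter are hypothetical and are not taken from the paper.

```python
def sequential_latency(num_layers, seq_len, cell_time):
    """One device evaluates every (layer, timestep) cell back to back."""
    return num_layers * seq_len * cell_time


def pipelined_latency(num_layers, seq_len, cell_time, link_time=0.0):
    """Assumed mapping: one layer per device, layers overlap in time.

    Cell (l, t) can start once
      - layer l-1 has finished timestep t and its output has crossed
        the inter-device link (link_time), and
      - layer l has finished timestep t-1 (recurrent dependence).
    """
    # finish[l][t] = completion time of layer l at timestep t
    finish = [[0.0] * seq_len for _ in range(num_layers)]
    for l in range(num_layers):
        for t in range(seq_len):
            input_ready = finish[l - 1][t] + link_time if l > 0 else 0.0
            state_ready = finish[l][t - 1] if t > 0 else 0.0
            finish[l][t] = max(input_ready, state_ready) + cell_time
    return finish[-1][-1]


if __name__ == "__main__":
    for layers in (1, 2, 4, 8):
        seq = sequential_latency(layers, 100, 1)
        pipe = pipelined_latency(layers, 100, 1)
        print(f"layers={layers}: sequential={seq}, pipelined={pipe}")
```

Under these assumptions, with zero link delay the pipelined latency is (seq_len + num_layers - 1) * cell_time, so a four-layer network over 100 timesteps takes 103 cell times instead of 400. This is only a model of why latency can stay nearly flat as layers are added; it says nothing about the measured 31x and 61x speedups.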


Pages: 2457-2462

Page count: 6
