Optimizing FPGA-Based Convolutional Neural Network Performance

Cited by: 3

Authors:

Kao, Chi-Chou [1]

Affiliation:

[1] Natl Univ Tainan, Dept Comp Sci & Informat Engn, Tainan 700, Taiwan

Source:

JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS | 2023, Vol. 32, No. 15

Keywords:

CNN; FPGA; optimize; performance; architecture;

DOI:

10.1142/S0218126623502547

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

In deep learning, convolutional neural networks (CNNs) are a class of artificial neural networks (ANNs) most commonly applied to the analysis of visual imagery. They are also known as Shift-Invariant or Space-Invariant Artificial Neural Networks (SIANNs), a name based on the shared-weight architecture of the convolution kernels or filters, which slide along input features and provide translation-equivariant responses known as feature maps. Recently, various FPGA-based CNN architectures have been proposed because FPGA platforms offer high performance and a fast development cycle. However, some key issues remain open: how to optimize the performance of CNN layers with different structures, how to design high-performance heterogeneous accelerators, and how to reduce the overhead of integrating with a neural network framework. To address these problems, we propose dynamic cycle pipeline tiling, data layout optimization, and a flexible, pipelined, software-hardware (SW-HW)-integrated architecture. The proposed architecture has been implemented on an FPGA board and tested with several benchmarks. The proposed dynamic tiling and data layout transformation improve performance by 2.3 times. Moreover, with two-level pipelining, we achieve up to five times speedup, and the proposed system is 3.8 times more energy-efficient than the GPU.
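
As an illustration of the general idea behind tiling a convolution layer, the following is a minimal sketch of generic output tiling, not the dynamic cycle pipeline tiling scheme proposed in the paper, whose details the abstract does not give. The function names `conv2d` and `conv2d_tiled` and the tile-size parameters are assumptions made for this example. The sketch shows that each output tile needs only a small input patch, which is the portion an accelerator would stage in on-chip memory.

```python
def conv2d(image, kernel):
    """Direct 2D convolution without padding (cross-correlation, as in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r in range(oh):
        for c in range(ow):
            out[r][c] = sum(image[r + i][c + j] * kernel[i][j]
                            for i in range(kh) for j in range(kw))
    return out


def conv2d_tiled(image, kernel, tile_h, tile_w):
    """Same result as conv2d, computed one output tile at a time.

    Hypothetical illustration: tile_h and tile_w are the output tile sizes.
    Each tile reads an input patch of at most
    (tile_h + kh - 1) x (tile_w + kw - 1) elements.
    """
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r0 in range(0, oh, tile_h):
        for c0 in range(0, ow, tile_w):
            th = min(tile_h, oh - r0)  # edge tiles may be smaller
            tw = min(tile_w, ow - c0)
            # Input patch for this tile, including the kernel halo.
            patch = [row[c0:c0 + tw + kw - 1]
                     for row in image[r0:r0 + th + kh - 1]]
            tile_out = conv2d(patch, kernel)
            for r in range(th):
                out[r0 + r][c0:c0 + tw] = tile_out[r]
    return out
```

The tile size trades buffer capacity against the amount of halo data re-read between neighbouring tiles; choosing it per layer is the kind of decision a tiling optimization has to make.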


Pages: 19
