Optimizing FPGA-Based Convolutional Neural Network Performance

Cited by: 3
Authors
Kao, Chi-Chou [1]
Affiliations
[1] Natl Univ Tainan, Dept Comp Sci & Informat Engn, Tainan 700, Taiwan
Keywords
CNN; FPGA; optimize; performance; architecture;
DOI
10.1142/S0218126623502547
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In deep learning, convolutional neural networks (CNNs) are a class of artificial neural networks (ANNs) most commonly applied to the analysis of visual imagery. They are also known as Shift-Invariant or Space-Invariant Artificial Neural Networks (SIANNs), owing to the shared-weight architecture of the convolution kernels or filters, which slide along input features and produce translation-equivariant responses known as feature maps. Recently, various FPGA-based CNN architectures have been proposed because FPGA platforms offer high performance and a fast development cycle. However, several key issues remain open: how to optimize the performance of CNN layers with different structures, how to design high-performance heterogeneous accelerators, and how to reduce the overhead of integrating with neural network frameworks. To address these problems, we propose dynamic cycle pipeline tiling, data layout optimization, and a flexible, pipelined software-hardware (SW-HW) integrated architecture. The proposed architecture has been implemented on an FPGA board and tested with several benchmarks. The proposed dynamic tiling and data layout transformation improve performance by 2.3 times. Moreover, with two-level pipelining, we achieve up to a fivefold speedup, and the proposed system is 3.8 times more energy-efficient than the GPU.
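The abstract names tiling as one of its optimizations but does not describe the scheme. As a rough illustration only, the sketch below shows generic output tiling of a 2D convolution in plain Python. It is not the paper's dynamic cycle pipeline tiling: the function names `conv2d` and `conv2d_tiled` and the fixed tile sizes are assumptions made for this example. Each output tile is computed from an input patch that includes a halo of `kernel size - 1` rows and columns, which is the block an accelerator would typically copy into on-chip buffers.

```python
def conv2d(image, kernel):
    """Direct 'valid' 2D convolution (cross-correlation form, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r in range(oh):
        for c in range(ow):
            out[r][c] = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
    return out


def conv2d_tiled(image, kernel, tile_h, tile_w):
    """Same result as conv2d, computed one output tile at a time.

    tile_h and tile_w are hypothetical, fixed tile sizes chosen by the caller.
    """
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r0 in range(0, oh, tile_h):
        for c0 in range(0, ow, tile_w):
            # Clip the tile at the bottom and right edges of the output.
            th = min(tile_h, oh - r0)
            tw = min(tile_w, ow - c0)
            # Input patch for this tile, including the (kh - 1, kw - 1) halo.
            patch = [
                row[c0:c0 + tw + kw - 1]
                for row in image[r0:r0 + th + kh - 1]
            ]
            tile_out = conv2d(patch, kernel)
            # Write the finished tile back into the full output.
            for r in range(th):
                out[r0 + r][c0:c0 + tw] = tile_out[r]
    return out
```

Because the tiles partition the output and each patch carries its own halo, the tiled version produces the same feature map as the direct loop for any tile size; only the order of memory accesses changes.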
Pages: 19
Related papers
50 in total
  • [41] Optimizing OpenCL Implementation of Deep Convolutional Neural Network on FPGA
    Qiao, Yuran
    Shen, Junzhong
    Huang, Dafei
    Yang, Qianming
    Wen, Mei
    Zhang, Chunyuan
    NETWORK AND PARALLEL COMPUTING (NPC 2017), 2017, 10578 : 100 - 111
  • [42] FPGA-based Acceleration of Neural Network Training
    Sang, Ruoyu
    Liu, Qiang
    Zhang, Qijun
    2016 IEEE MTT-S INTERNATIONAL CONFERENCE ON NUMERICAL ELECTROMAGNETIC AND MULTIPHYSICS MODELING AND OPTIMIZATION (NEMO), 2016,
  • [43] A High Performance FPGA-based Accelerator for Large-Scale Convolutional Neural Networks
    Li, Huimin
    Fan, Xitian
    Jiao, Li
    Cao, Wei
    Zhou, Xuegong
    Wang, Lingli
    2016 26TH INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS (FPL), 2016,
  • [44] FPGA-based Accelerator for Losslessly Quantized Convolutional Neural Networks
    Sit, Mankit
    Kazami, Ryosuke
    Amano, Hideharu
    2017 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE TECHNOLOGY (ICFPT), 2017, : 295 - 298
  • [45] An FPGA-based Accelerator Implementation for Deep Convolutional Neural Networks
    Zhou, Yongmei
    Jiang, Jingfei
    PROCEEDINGS OF 2015 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2015), 2015, : 829 - 832
  • [46] OPU: An FPGA-Based Overlay Processor for Convolutional Neural Networks
    Yu, Yunxuan
    Wu, Chen
    Zhao, Tiandong
    Wang, Kun
    He, Lei
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2020, 28 (01) : 35 - 47
  • [47] Composite FPGA-based Accelerator for Deep Convolutional Neural Networks
    HuanZhang
    YuanYang
    YangXiao
    2019 IEEE INTERNATIONAL CONFERENCE ON ELECTRON DEVICES AND SOLID-STATE CIRCUITS (EDSSC), 2019,
  • [48] A FPGA-based Hardware Accelerator for Multiple Convolutional Neural Networks
    Yao, Yuchen
    Duan, Qinghua
    Zhang, Zhiqian
    Gao, Jiabao
    Wang, Jian
    Yang, Meng
    Tao, Xinxuan
    Lai, Jinmei
    2018 14TH IEEE INTERNATIONAL CONFERENCE ON SOLID-STATE AND INTEGRATED CIRCUIT TECHNOLOGY (ICSICT), 2018, : 1075 - 1077
  • [49] Optimizing Bayesian Recurrent Neural Networks on an FPGA-based Accelerator
    Ferianc, Martin
    Que, Zhiqiang
    Fan, Hongxiang
    Luk, Wayne
    Rodrigues, Miguel
    2021 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (ICFPT), 2021, : 19 - 28
  • [50] Optimizing a FPGA-based Neural Accelerator for Small IoT Devices
    Hong, Seongmin
    Lee, Inho
    Park, Yongjun
    2018 INTERNATIONAL CONFERENCE ON ELECTRONICS, INFORMATION, AND COMMUNICATION (ICEIC), 2018, : 176 - 177