Optimizing FPGA-Based Convolutional Neural Network Performance

Cited by: 3
Authors
Kao, Chi-Chou [1]
Affiliations
[1] Natl Univ Tainan, Dept Comp Sci & Informat Engn, Tainan 700, Taiwan
Keywords
CNN; FPGA; optimize; performance; architecture
DOI
10.1142/S0218126623502547
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In deep learning, convolutional neural networks (CNNs) are a class of artificial neural networks (ANNs) most commonly applied to the analysis of visual imagery. They are also known as Shift-Invariant or Space-Invariant Artificial Neural Networks (SIANNs) because of their shared-weight architecture: convolution kernels, or filters, slide along the input features and produce translation-equivariant responses known as feature maps. Various FPGA-based CNN architectures have recently been proposed because FPGAs offer high performance and a fast development cycle. However, several key issues remain open: how to optimize the performance of CNN layers with different structures, how to design high-performance heterogeneous accelerators, and how to reduce the overhead of integrating with a neural network framework. To address these problems, we propose dynamic cycle pipeline tiling, data layout optimization, and a flexible, pipelined software and hardware (SW-HW) integrated architecture. The proposed architecture has been implemented on an FPGA board and tested with several benchmarks. The proposed dynamic tiling and data layout transformation improve performance by a factor of 2.3. Moreover, with two-level pipelining, we achieve a speedup of up to five times, and the proposed system is 3.8 times more energy-efficient than the GPU.
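The abstract names tiling as one of its optimizations but does not spell out the dynamic cycle pipeline tiling scheme itself. As a rough illustration of the general idea only, the following minimal Python sketch shows plain static loop tiling of a convolution layer: the output is computed in small blocks so that each block's working set could fit in on-chip buffers. The function names, the nested-list data layout, and the tile sizes (tile_m, tile_r, tile_c) are assumptions made for this example and are not taken from the paper.

```python
def conv2d_naive(x, w):
    """Valid, stride-1 convolution. x: [C][H][W], w: [M][C][K][K] (nested lists)."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    M, K = len(w), len(w[0][0])
    OH, OW = H - K + 1, W - K + 1
    out = [[[0] * OW for _ in range(OH)] for _ in range(M)]
    for m in range(M):
        for r in range(OH):
            for c in range(OW):
                acc = 0
                for ch in range(C):
                    for i in range(K):
                        for j in range(K):
                            acc += x[ch][r + i][c + j] * w[m][ch][i][j]
                out[m][r][c] = acc
    return out


def conv2d_tiled(x, w, tile_m=2, tile_r=2, tile_c=2):
    """Same result as conv2d_naive, computed one output tile at a time.

    Hypothetical static tiling for illustration; tile sizes are assumptions.
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])
    M, K = len(w), len(w[0][0])
    OH, OW = H - K + 1, W - K + 1
    out = [[[0] * OW for _ in range(OH)] for _ in range(M)]
    # Outer loops walk over tiles of (output channel, output row, output column).
    for m0 in range(0, M, tile_m):
        for r0 in range(0, OH, tile_r):
            for c0 in range(0, OW, tile_c):
                # Inner loops stay inside one tile; min() clips partial edge tiles.
                for m in range(m0, min(m0 + tile_m, M)):
                    for r in range(r0, min(r0 + tile_r, OH)):
                        for c in range(c0, min(c0 + tile_c, OW)):
                            acc = 0
                            for ch in range(C):
                                for i in range(K):
                                    for j in range(K):
                                        acc += x[ch][r + i][c + j] * w[m][ch][i][j]
                            out[m][r][c] = acc
    return out
```

On an FPGA the inner tile loops would typically be unrolled or pipelined in hardware while the outer loops schedule data movement; this sketch only checks that reordering the loops into tiles leaves the result unchanged.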
Pages: 19
Related papers
50 in total
  • [31] A survey of FPGA-based accelerators for convolutional neural networks
    Sparsh Mittal
    Neural Computing and Applications, 2020, 32 : 1109 - 1139
  • [32] A survey of FPGA-based accelerators for convolutional neural networks
    Mittal, Sparsh
    NEURAL COMPUTING & APPLICATIONS, 2020, 32 (04): : 1109 - 1139
  • [33] A Fast FPGA-based Deep Convolutional Neural Network Using Pseudo Parallel Memories
    Hailesellasie, Muluken
    Hasan, Syed Rafay
    2017 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2017, : 364 - 367
  • [34] A High Utilization FPGA-Based Accelerator for Variable-Scale Convolutional Neural Network
    Li, Xin
    Cai, Yujie
    Han, Jun
    Zeng, Xiaoyang
    2017 IEEE 12TH INTERNATIONAL CONFERENCE ON ASIC (ASICON), 2017, : 944 - 947
  • [35] FPGA-based Convolutional Neural Network Accelerator design using High Level Synthesize
    Ghaffari, Sina
    Sharifian, Saeed
    2016 2ND INTERNATIONAL CONFERENCE OF SIGNAL PROCESSING AND INTELLIGENT SYSTEMS (ICSPIS), 2016, : 29 - 34
  • [36] FPGA-based parallel implementation to classify Hyperspectral images by using a Convolutional Neural Network
    Baba, Abdullatif
    Bonny, Talal
    INTEGRATION-THE VLSI JOURNAL, 2023, 92 : 15 - 23
  • [37] An Efficient FPGA-based Depthwise Separable Convolutional Neural Network Accelerator with Hardware Pruning
    Liu, Zhengyan
    Liu, Qiang
    Yan, Shun
    Cheung, Ray C. C.
    ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2024, 17 (01)
  • [38] A Dynamic Reconfigurable Architecture for Hybrid Spiking and Convolutional FPGA-Based Neural Network Designs
    Irmak, Hasan
    Corradi, Federico
    Detterer, Paul
    Alachiotis, Nikolaos
    Ziener, Daniel
    JOURNAL OF LOW POWER ELECTRONICS AND APPLICATIONS, 2021, 11 (03)
  • [39] Performance-oriented FPGA-based convolution neural network designs
    Kao, Chi-Chou
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (14) : 21019 - 21030
  • [40] Performance-oriented FPGA-based convolution neural network designs
    Chi-Chou Kao
    Multimedia Tools and Applications, 2023, 82 : 21019 - 21030