Optimizing FPGA-Based Convolutional Neural Network Performance

Cited by: 3
Author
Kao, Chi-Chou [1]
Affiliation
[1] Natl Univ Tainan, Dept Comp Sci & Informat Engn, Tainan 700, Taiwan
Keywords
CNN; FPGA; optimize; performance; architecture
DOI
10.1142/S0218126623502547
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In deep learning, convolutional neural networks (CNNs) are a class of artificial neural networks (ANNs) most commonly applied to the analysis of visual imagery. They are also known as Shift-Invariant or Space-Invariant Artificial Neural Networks (SIANNs), because of the shared-weight architecture of the convolution kernels or filters, which slide along the input features and produce translation-equivariant responses known as feature maps. Recently, various FPGA-based CNN architectures have been proposed, because FPGAs offer high performance and a fast development cycle. However, several key issues remain open: how to optimize the performance of CNN layers with different structures, how to design high-performance heterogeneous accelerators, and how to reduce the overhead of integrating with a neural network framework. To address these problems, we propose dynamic cycle pipeline tiling, data layout optimization, and a flexible, pipelined, integrated software-hardware (SW-HW) architecture. The proposed architecture was implemented on an FPGA board and evaluated on several benchmarks. The proposed dynamic tiling and data layout transformation improve performance by 2.3 times. Moreover, with two-level pipelining, we achieve up to a fivefold speedup, and the proposed system is 3.8 times more energy-efficient than a GPU.
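The abstract names tiling and data layout transformation without describing them, so the following is only a generic sketch of the two ideas, not the paper's dynamic cycle pipeline tiling. It assumes a stride-1, unpadded convolution on nested lists. The function names (`conv2d_direct`, `conv2d_tiled`, `chw_to_hwc`) and the tile sizes are hypothetical and chosen for illustration.

```python
def conv2d_direct(x, w):
    """Direct convolution, stride 1, no padding.

    x: input as nested lists indexed [C_in][H][W]
    w: weights as nested lists indexed [C_out][C_in][K][K]
    """
    c_in, h, wd = len(x), len(x[0]), len(x[0][0])
    c_out, k = len(w), len(w[0][0])
    h_out, w_out = h - k + 1, wd - k + 1
    y = [[[0.0] * w_out for _ in range(h_out)] for _ in range(c_out)]
    for o in range(c_out):
        for r in range(h_out):
            for c in range(w_out):
                acc = 0.0
                for ci in range(c_in):
                    for i in range(k):
                        for j in range(k):
                            acc += w[o][ci][i][j] * x[ci][r + i][c + j]
                y[o][r][c] = acc
    return y


def conv2d_tiled(x, w, tile_h=4, tile_w=4):
    """Same result as conv2d_direct, computed one output tile at a time.

    Each tile only needs a small input patch (the tile plus a K-1 halo),
    which is the part an accelerator would keep in on-chip buffers.
    tile_h and tile_w are illustrative values, not taken from the paper.
    """
    h, wd = len(x[0]), len(x[0][0])
    c_out, k = len(w), len(w[0][0])
    h_out, w_out = h - k + 1, wd - k + 1
    y = [[[0.0] * w_out for _ in range(h_out)] for _ in range(c_out)]
    for r0 in range(0, h_out, tile_h):
        r1 = min(r0 + tile_h, h_out)
        for c0 in range(0, w_out, tile_w):
            c1 = min(c0 + tile_w, w_out)
            # Copy the input patch for this tile, including the halo.
            patch = [[row[c0:c1 + k - 1] for row in ch[r0:r1 + k - 1]]
                     for ch in x]
            tile = conv2d_direct(patch, w)
            # Write the finished tile back into the full output.
            for o in range(c_out):
                for r in range(r0, r1):
                    y[o][r][c0:c1] = tile[o][r - r0]
    return y


def chw_to_hwc(x):
    """Layout change from [C][H][W] to [H][W][C], so that all channels
    of one pixel sit next to each other."""
    c_in, h, wd = len(x), len(x[0]), len(x[0][0])
    return [[[x[ci][r][c] for ci in range(c_in)] for c in range(wd)]
            for r in range(h)]
```

Tiling leaves the arithmetic unchanged and only reorders it so that each step touches a bounded working set. The layout change likewise preserves values and only changes which elements are adjacent in memory. How the paper selects tile sizes dynamically, and which layout it targets, is not stated in the abstract.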
Pages: 19