An Adaptive Row-based Weight Reuse Scheme for FPGA Implementation of Convolutional Neural Networks

Authors
Je, Hyeonseung [1]
Duy Thanh Nguyen [1]
Lee, Kyujoong [2]
Lee, Hyuk-Jae [1]
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul, South Korea
[2] Sunmoon Univ, Dept Elect Engn, Asan, South Korea
Keywords
FPGA; Convolutional neural networks; U-Net; Row-reuse scheme; Adaptive
DOI
10.1109/ITC-CSCC52171.2021.9501490
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
There is an increasing need to implement convolutional neural networks (CNNs) on FPGAs, thanks to their design flexibility compared with ASICs and their low power consumption compared with GPUs. To deploy a CNN successfully, both the size of the network and the resources of the target FPGA board must be considered. However, previous works use a fixed dataflow that is not optimized for each layer, which results in high on-chip buffer utilization and frequent memory access. The row-based weight reuse scheme is efficient in reducing input/output buffer size, but it causes resource underutilization for layers with small feature maps. This paper proposes an adaptive row-reuse scheme that applies a different level of row reuse to each layer depending on the layer's characteristics. The proposed design is implemented on a Xilinx KCU1500 board, and the accelerator achieves a throughput of 994.74 GOPS for U-Net. For general CNN implementation, the proposed scheme achieves 1080.9 GOPS when running VGG16, with a buffer size 1.7 times smaller than in previous works.
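The underutilization problem mentioned in the abstract can be illustrated with a toy model. The sketch below is not the paper's algorithm: the candidate row-reuse levels, the utilization formula, the tie-breaking rule, and the layer heights are all assumptions chosen only to show why a single fixed level may waste compute on small feature maps while a per-layer choice avoids it.

```python
import math

# Hypothetical candidate row-reuse levels (output rows processed per pass).
LEVELS = (1, 2, 4, 8)


def row_utilization(height, rows):
    """Fraction of row slots doing useful work for a feature map of `height` rows."""
    passes = math.ceil(height / rows)
    return height / (passes * rows)


def pick_row_level(height, levels=LEVELS):
    """Pick the level with the highest utilization; on ties prefer the larger level."""
    return max(levels, key=lambda r: (row_utilization(height, r), r))


# Hypothetical output feature-map heights for four layers.
layers = {"large": 224, "medium": 28, "small": 7, "tiny": 2}
plan = {name: pick_row_level(h) for name, h in layers.items()}

for name, h in layers.items():
    fixed = row_utilization(h, 8)           # fixed dataflow with level 8
    adaptive = row_utilization(h, plan[name])
    print(f"{name}: height={h} level={plan[name]} fixed={fixed:.3f} adaptive={adaptive:.3f}")
```

In this model a fixed level of 8 leaves three quarters of the row slots idle on the 2-row layer, whereas the per-layer choice keeps every slot busy. A real design would also weigh buffer size and memory traffic, which this sketch ignores.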
Pages: 4