FPGA based convolution and memory architecture for Convolutional Neural Network

Cited by: 2
Authors
Shahan, K. A. [1 ]
Rani, Sheeba J. [1 ]
Affiliations
[1] Indian Inst Space Sci & Technol, Dept Avion, Thiruvananthapuram, Kerala, India
Keywords
convolution; neural network; Winograd; efficient hardware architecture; deep convolutional neural network; memory reuse; FPGA
DOI
10.1109/VLSID49098.2020.00049
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Convolutional Neural Networks (CNNs) are widely used in vision-based applications to improve performance, but at the cost of higher storage requirements and increased computation. Hardware implementations of CNNs are limited by computational complexity and by the bandwidth available when accessing off-chip memory. In this work, a novel FPGA-based hardware architecture is proposed to accelerate CNNs: a 2D convolution unit with reduced computational complexity based on Winograd's 2D minimal filtering algorithm, together with a memory architecture that reduces the number of on-chip read operations needed to access adjacent input data tiles. An on-chip memory bank reuse scheme is also employed to reduce the number of read and write operations to off-chip memory. The proposed convolution architecture achieves lower computational complexity by reducing the number of multiplications without a proportionate increase in the number of additions compared to prior implementations. The number of data read operations from on-chip memory is reduced by a factor of 4, and the on-chip memory bank reuse scheme lowers the latency of accessing intermediate data. The implementation uses a 16-bit fixed-point representation, whose bit width could be further reduced to save area and energy. A Virtex UltraScale+ VCU118 Evaluation Board 2.0 populated with an XCVU9P-L2FLGA2104 device is used as the implementation platform, and a VGGNet-based CNN is used for the implementation. The computation time of each convolutional layer is also estimated and is found to be reduced. For a 3x3 kernel, the number of multiplications is reduced from 9 to 4 compared with the standard convolution operation, and the number of additions is reduced from 14 to 12 compared with prior hardware implementations of Winograd's 2D minimal filtering algorithm.
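For reference, the sketch below illustrates the kind of transform the abstract refers to: a floating-point Winograd F(2x2, 3x3) tile convolution in Python. It is only a minimal sketch under assumptions, using the standard transform matrices for this tile size rather than the paper's 16-bit fixed-point FPGA datapath or its exact adder structure; it shows how a 2x2 output tile is produced with 16 element-wise multiplications, i.e. 4 multiplications per output instead of 9 for direct 3x3 convolution.

# Minimal sketch of Winograd F(2x2, 3x3) 2D convolution (floating point;
# the paper's datapath is 16-bit fixed point on an FPGA). The transform
# matrices are the standard ones for this tile size and are an assumption,
# not a reproduction of the paper's exact adder structure.
import numpy as np

G = np.array([[1.0, 0.0, 0.0],       # kernel transform G (4x3)
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])
Bt = np.array([[1, 0, -1, 0],        # input-tile transform B^T (4x4)
               [0, 1, 1, 0],
               [0, -1, 1, 0],
               [0, 1, 0, -1]], dtype=float)
At = np.array([[1, 1, 1, 0],         # inverse transform A^T (2x4)
               [0, 1, -1, -1]], dtype=float)

def winograd_f2x2_3x3(d, g):
    """2x2 output tile from a 4x4 input tile d and a 3x3 kernel g."""
    U = G @ g @ G.T        # kernel transform: precomputable once per filter
    V = Bt @ d @ Bt.T      # data transform: additions/subtractions only
    M = U * V              # 16 element-wise multiplications -> 4 per output
    return At @ M @ At.T   # inverse transform back to the 2x2 spatial tile

# Cross-check against direct (valid) 3x3 convolution as used in CNN layers,
# which needs 9 multiplications per output element.
d = np.random.rand(4, 4)
g = np.random.rand(3, 3)
direct = np.array([[np.sum(d[i:i + 3, j:j + 3] * g) for j in range(2)]
                   for i in range(2)])
assert np.allclose(winograd_f2x2_3x3(d, g), direct)

Note that adjacent 4x4 input tiles for neighbouring 2x2 output tiles overlap by two rows or columns, which is the kind of data reuse the proposed memory architecture appears to target when reducing on-chip read operations.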
Pages
183 - 188
Number of pages
6