FPGA based convolution and memory architecture for Convolutional Neural Network

Cited by: 2

Authors:

Shahan, K. A. [1]

Rani, Sheeba J. [1]

Affiliations:

[1] Indian Inst Space Sci & Technol, Dept Avion, Thiruvananthapuram, Kerala, India

Source:

2020 33RD INTERNATIONAL CONFERENCE ON VLSI DESIGN AND 2020 19TH INTERNATIONAL CONFERENCE ON EMBEDDED SYSTEMS (VLSID) | 2020

Keywords:

convolution; neural network; winograd efficient; hardware; architecture; deep convolutional neural network; memory reuse; FPGA;

DOI:

10.1109/VLSID49098.2020.00049

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Convolutional Neural Networks (CNNs) are widely used in vision-based applications to improve performance, but at the cost of higher storage and computation requirements. Hardware implementations of CNNs are limited by computational complexity and by the bandwidth available when accessing off-chip memory. This work proposes two things to accelerate CNNs: a novel FPGA-based hardware architecture for the 2D convolution operation, which reduces computational complexity using Winograd's 2D minimal filtering algorithm, and a memory architecture that reduces the on-chip read operations needed to access adjacent input data tiles during convolution. An on-chip memory bank reuse architecture is also used to reduce the number of read and write operations to off-chip memory. The proposed convolution architecture achieves lower computational complexity by reducing the number of multiplications without a proportionate increase in the number of additions compared to prior implementations. The number of data read operations from on-chip memory is reduced by a factor of 4, and the on-chip memory bank reuse scheme reduces the latency associated with accessing intermediate data. The implementation uses a 16-bit fixed-point representation, which reduces bit width and can save area and energy. The design is implemented on a Virtex UltraScale+ VCU118 Evaluation Board 2.0 populated with an XCVU9P-L2FLGA2104 device, using a VGG Net based CNN. The computation time for each individual convolutional layer is also estimated and found to be reduced. For a 3x3 kernel, the number of multiplications is reduced from 9 to 4 compared to the standard convolution operation, and the number of additions is reduced from 14 to 12 compared to prior hardware implementations of Winograd's 2D minimal filtering algorithm.
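The multiplication saving quoted in the abstract can be illustrated with a small software sketch of Winograd's 2D minimal filtering algorithm in its common F(2x2, 3x3) form. This is not the paper's hardware design: the transform matrices are the standard textbook ones, the function names (`winograd_f2x2_3x3`, `direct_conv`) are hypothetical, and exact fractions stand in for the 16-bit fixed-point arithmetic. A 4x4 input tile and a 3x3 kernel yield a 2x2 output tile using 16 element-wise multiplications, that is, 4 per output instead of 9.

```python
from fractions import Fraction

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

H = Fraction(1, 2)
# Standard F(2x2, 3x3) transform matrices (assumed here, not taken from the paper).
BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]  # input transform
G = [[1, 0, 0], [H, H, H], [H, -H, H], [0, 0, 1]]                 # kernel transform
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]                               # output transform

def winograd_f2x2_3x3(d, g):
    """Compute a 2x2 output tile from a 4x4 input tile d and a 3x3 kernel g."""
    u = matmul(matmul(G, g), transpose(G))    # 4x4 transformed kernel
    v = matmul(matmul(BT, d), transpose(BT))  # 4x4 transformed input tile
    # The only general multiplications: 16 for 4 outputs.
    m = [[u[i][j] * v[i][j] for j in range(4)] for i in range(4)]
    return matmul(matmul(AT, m), transpose(AT))

def direct_conv(d, g):
    """Reference sliding-window result: 9 multiplications per output."""
    return [[sum(d[i + r][j + c] * g[r][c] for r in range(3) for c in range(3))
             for j in range(2)] for i in range(2)]

tile = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
print(winograd_f2x2_3x3(tile, kernel) == direct_conv(tile, kernel))
```

In a CNN the kernel transform can be computed once per filter and reused across tiles, so the per-tile cost is dominated by the input transform, the 16 multiplications, and the output transform. Adjacent tiles overlap by two rows or columns, which is the kind of redundancy a tile-oriented memory architecture can exploit.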


Pages: 183-188

Page count: 6
