FPGA Implementation of a SIMD-Based Array Processor with Torus Interconnect

Cited by: 0

Authors:

Murakami, Yuki [1]

Affiliations:

[1] Univ Aizu, Grad Sch Comp Sci & Engn, Aizu Wakamatsu, Fukushima, Japan

Source:

2015 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE TECHNOLOGY (FPT) | 2015

Keywords:

Matrix-Matrix Multiply-Add; Convolution; Convolutional Neural Networks; Array Processor;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Matrix computations are a fundamental tool in scientific and engineering applications. Among these applications, Convolutional Neural Networks (CNNs), which can be computed effectively by matrix-matrix multiplication, are becoming popular, so an efficient implementation of CNNs is highly important. In this study, we designed a parallel processor for matrix computations using a torus interconnect topology and implemented Cannon's algorithm for matrix-matrix multiply-add on it. We evaluated the scalability of the proposed processor on a reconfigurable FPGA platform. More precisely, the designed processor, with 8 x 8 functional units with 16-bit floating-point multiply-add units, was evaluated on a Cyclone IV FPGA chip and achieved a performance of 27 GFlops. We also implemented CNN calculations on our processor and compared the matrix-based approach with our proposed method. As a result, our method is 25 times faster than the matrix-based approach when the processor has 8 x 8 functional units, the image size is 32 x 32, and the filter size is 5 x 5.
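
For readers unfamiliar with Cannon's algorithm, the following is a minimal software sketch of matrix-matrix multiply-add on an n x n torus, assuming one matrix element per processing element (PE). It is an illustration only and is not taken from the paper: the function names `cannon_multiply_add` and `reference_multiply_add` are hypothetical, and the paper's hardware design, data layout, and 16-bit floating-point arithmetic are not modeled here.

```python
def cannon_multiply_add(A, B, C):
    """Return C + A*B for n x n lists of lists, simulating an n x n torus.

    Assumption for illustration: PE (i, j) holds exactly one element of
    A, B, and C. The real processor's mapping is not described here.
    """
    n = len(A)
    # Initial alignment (skew): row i of A is rotated left by i positions,
    # column j of B is rotated up by j positions.
    a = [[A[i][(j + i) % n] for j in range(n)] for i in range(n)]
    b = [[B[(i + j) % n][j] for j in range(n)] for i in range(n)]
    c = [row[:] for row in C]

    for _ in range(n):
        # Every PE performs one local multiply-add.
        for i in range(n):
            for j in range(n):
                c[i][j] += a[i][j] * b[i][j]
        # Torus shifts with wraparound: A moves one PE left, B one PE up.
        a = [[a[i][(j + 1) % n] for j in range(n)] for i in range(n)]
        b = [[b[(i + 1) % n][j] for j in range(n)] for i in range(n)]
    return c


def reference_multiply_add(A, B, C):
    """Plain triple-loop C + A*B, used only to check the sketch."""
    n = len(A)
    return [
        [C[i][j] + sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    C = [[1, 1], [1, 1]]
    print(cannon_multiply_add(A, B, C))
```

After the initial skew, every step needs only nearest-neighbor communication with wraparound, which is why the algorithm maps naturally onto a torus interconnect.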


Pages: 244-247

Page count: 4

