In-FPGA Instrumentation Framework for OpenCL-Based Designs

Cited by: 2

Authors:

Bensalem, Hachem ^{[1]}

Blaquiere, Yves ^{[1]}

Savaria, Yvon ^{[2]}

Affiliations:

[1] Ecole Technol Super, Dept Elect Engn, Montreal, PQ H3C 1K3, Canada

[2] Polytech Montreal, Dept Elect Engn, Montreal, PQ H3T 1J4, Canada

Source:

IEEE ACCESS | 2020 / Vol. 8 / Issue 08

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC);

Keywords:

Field programmable gate arrays; Instruments; Tools; Kernel; Debugging; Hardware; Benchmark testing; OpenCL; FPGA; instrumentation; high-performance reconfigurable computing; HLS; timing performance;

DOI:

10.1109/ACCESS.2020.3040081

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

The productivity achieved when developing applications on high-performance reconfigurable heterogeneous computing (HPRHC) systems is increased by using the Open Computing Language (OpenCL). However, the hardware produced by OpenCL compilers in field-programmable gate arrays (FPGAs) can result in severe performance bottlenecks that are challenging to solve. The problem is compounded by the fact that the generated netlist details are disorganized, making them mostly unreadable and only partially visible to designers. This paper proposes an in-FPGA instrumentation method and a new framework for extracting the FPGA-cycle-accurate timing performance of OpenCL-based designs. The results clearly show that the chosen execution model for OpenCL-based designs strongly affects the timing performance when it is not properly implemented. Our framework is implemented on an HPRHC platform that contains a CPU and two Arria10 FPGAs, and it is evaluated with a wide variety of benchmarks with different complexities. After testing on the reported benchmarks, the average logic overhead for one inserted instrument is 0.2% of the total amount of adaptive look-up tables (ALUTs) and 0.1% of the total registers in an FPGA. This resource utilization is between 1.5 and 6 times lower than those reported in the best previously published works. The scalability of the framework is also evaluated by inserting up to 50 instruments. The experimental results show that the average logic utilization per instrument is 0.19% of the ALUTs and 0.17% of the registers in the FPGA when 50 instruments are inserted.

Citation

Pages: 212979 - 212994

Page count: 16

50 items in total

[21] Using Machine Learning to Estimate Utilization and Throughput for OpenCL-Based SpMV Implementation on an FPGA
Naher, Jannatun
Gloster, Clay
Jadhav, Shrikant S.
Doss, Christopher C.
IEEE SOUTHEASTCON 2020, 2020
[22] OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured Meshes
Kenter, Tobias
Mahale, Gopinath
Alhaddad, Samer
Grynko, Yevgen
Foerstner, Jens
Plessl, Christian
Schmitt, Christian
Afzal, Ayesha
Hannig, Frank
PROCEEDINGS 26TH IEEE ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM 2018), 2018, : 189 - 196
[23] Realization and Optimization of Pulse Compression Algorithm on OpenCL-Based FPGA Heterogeneous Computing Platform
Yu, Jiacheng
Li, Xingming
Hu, Shanqing
Wang, Yuwei
SIGNAL AND INFORMATION PROCESSING, NETWORKING AND COMPUTERS, 2018, 473 : 147 - 155
[24] An OpenCL-based Acceleration for Canny Algorithm Using a Heterogeneous CPU-FPGA Platform
Rahamneh, Samah
Sawalha, Lina
2019 27TH IEEE ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2019, : 322 - 322
[25] OpenCL-Based Design of an FPGA Accelerator for H.266/VVC Transform and Quantization
Waidyasooriya, Hasitha Muthumala
Hariyama, Masanori
Iwasaki, Hiroe
Kobayashi, Daisuke
Omori, Yuya
Nakamura, Ken
Nitta, Koyo
Sano, Kimikazu
2022 IEEE 65TH INTERNATIONAL MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS (MWSCAS 2022), 2022
[26] FSCHOL: An OpenCL-based HPC Framework for Accelerating Sparse Cholesky Factorization on FPGAs
Tavakoli, Erfan Bank
Riera, Michael
Quraishi, Masudul Hassan
Ren, Fengbo
2021 IEEE 33RD INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD 2021), 2021, : 209 - 220
[27] PipeCNN: An OpenCL-Based Open-Source FPGA Accelerator for Convolution Neural Networks
Wang, Dong
Xu, Ke
Jiang, Diankun
2017 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE TECHNOLOGY (ICFPT), 2017, : 279 - 282
[28] OpenCL-Based Erasure Coding on Heterogeneous Architectures
Chen, Guoyang
Zhou, Huiyang
Shen, Xipeng
Gahm, Josh
Venkat, Narayan
Booth, Skip
Marshall, John
2016 IEEE 27TH INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS (ASAP), 2016, : 33 - 40
[29] An OpenCL-based SIFT Accelerator for Image Features Extraction on FPGA in Mobile Edge Computing Environment
Duc Canh Le
Oh, Eun Young
Jeong, Jae Ho
Kim, Sung Hyun
Jeon, Minsu
Jang, Jonghyun
Youn, Chan-Hyun
2018 INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY CONVERGENCE (ICTC), 2018, : 1406 - 1410
[30] EDSSA: An Encoder-Decoder Semantic Segmentation Networks Accelerator on OpenCL-Based FPGA Platform
Huang, Hongzhi
Wu, Yakun
Yu, Mengqi
Shi, Xuesong
Qiao, Fei
Luo, Li
Wei, Qi
Liu, Xinjun
SENSORS, 2020, 20 (14) : 1 - 18
