Comprehensive Evaluation of OpenCL-Based CNN Implementations for FPGAs

Times cited: 1
Authors
Tapiador-Morales, Ricardo [1]
Rios-Navarro, Antonio [1]
Linares-Barranco, Alejandro [1]
Kim, Minkyu [2]
Kadetotad, Deepak [2]
Seo, Jae-sun [2]
Affiliations
[1] Univ Seville, Robot & Technol Comp Lab, Seville, Spain
[2] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ USA
Keywords
Deep learning; Convolutional Neural Network; Hardware acceleration; OpenCL; FPGA; Caffe; Xilinx; Altera
DOI
10.1007/978-3-319-59147-6_24
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep learning has significantly advanced the state of the art in artificial intelligence, gaining wide popularity in both industry and academia. Of special interest are Convolutional Neural Networks (CNNs), which take inspiration from the hierarchical structure of the visual cortex to form deep layers of convolutional operations, along with fully connected classifiers. Hardware implementations of these deep CNN architectures are challenged by memory bottlenecks, because their many convolutional and fully connected layers demand a large amount of communication for parallel computation. Multi-core CPU-based solutions have proven inadequate for this problem due to the memory wall and low parallelism. Many-core GPU architectures show superior performance, but they consume high power and also have memory constraints due to inconsistencies between cache and main memory. OpenCL is commonly used to describe these architectures for execution on GPGPUs or FPGAs. FPGA design solutions are also actively being explored; they allow the memory hierarchy to be implemented using embedded parallel BlockRAMs. This boosts the parallel use of shared memory elements among multiple processing units, avoiding data replication and inconsistencies, and makes FPGAs potentially powerful solutions for real-time classification with CNNs. In this paper, the OpenCL co-design frameworks adopted by both Altera and Xilinx for pseudo-automatic development are evaluated. A comprehensive evaluation and comparison for a 5-layer deep CNN is presented. Hardware resources, temporal performance, and the OpenCL architecture for CNNs are discussed. Xilinx demonstrates faster synthesis, better FPGA resource utilization, and more compact boards. Altera provides multi-platform tools, a mature design community, and better execution times.
Pages: 271-282
Number of pages: 12