An OpenCL-Based FPGA Accelerator for Faster R-CNN

Cited by: 5
Authors
An, Jianjing [1, 2]
Zhang, Dezheng [1, 2]
Xu, Ke [1, 2]
Wang, Dong [1, 2]
Affiliations
[1] Beijing Jiaotong Univ, Inst Informat Sci, Beijing 100044, Peoples R China
[2] Beijing Jiaotong Univ, Beijing Key Lab Adv Informat Sci & Network Techno, Beijing 100044, Peoples R China
Funding
Beijing Natural Science Foundation
Keywords
convolutional neural network; Faster R-CNN; FPGA; hardware accelerator
DOI
10.3390/e24101346
Chinese Library Classification number
O4 [Physics]
Discipline classification code
0702
Abstract
In recent years, convolutional neural network (CNN)-based object detection algorithms have made breakthroughs, and much of the related research has focused on hardware accelerator designs. Although many previous works have proposed efficient FPGA designs for one-stage detectors such as YOLO, there are still few accelerator designs for the Faster R-CNN (faster regions with CNN features) algorithm. Moreover, the inherently high computational and memory complexity of CNNs makes it challenging to design efficient accelerators. This paper proposes a software-hardware co-design scheme based on OpenCL to implement the Faster R-CNN object detection algorithm on an FPGA. First, we design an efficient, deeply pipelined FPGA hardware accelerator that can implement Faster R-CNN with different backbone networks. Then, we propose an optimized hardware-aware software algorithm that includes fixed-point quantization, layer fusion, and a multi-batch region-of-interest (RoI) detector. Finally, we present an end-to-end design space exploration scheme to comprehensively evaluate the performance and resource utilization of the proposed accelerator. Experimental results show that the proposed design achieves a peak throughput of 846.9 GOP/s at a working frequency of 172 MHz. Compared with the state-of-the-art Faster R-CNN accelerator and the one-stage YOLO accelerator, our method achieves 10× and 2.1× inference throughput improvements, respectively.
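As an illustration of the fixed-point quantization step named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes signed 8-bit codes with a power-of-two scale chosen per tensor. The abstract does not state the bit width or the scaling scheme, and the function names `choose_frac_bits`, `quantize`, and `dequantize` are hypothetical.

```python
import math


def choose_frac_bits(values, total_bits=8):
    """Pick the number of fractional bits so the largest magnitude still fits.

    Assumption: one power-of-two scale per tensor, signed codes.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return total_bits - 1
    # Integer bits needed for the magnitude; the sign bit is counted separately.
    int_bits = math.floor(math.log2(max_abs)) + 1
    return total_bits - 1 - int_bits


def quantize(values, frac_bits, total_bits=8):
    """Round to the nearest fixed-point code and saturate to the signed range."""
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    scale = 2.0 ** frac_bits
    return [min(hi, max(lo, round(v * scale))) for v in values]


def dequantize(codes, frac_bits):
    """Map fixed-point codes back to real values."""
    scale = 2.0 ** frac_bits
    return [c / scale for c in codes]


weights = [0.5, -0.25, 0.75, -1.0]      # illustrative values, not from the paper
frac_bits = choose_frac_bits(weights)
codes = quantize(weights, frac_bits)
restored = dequantize(codes, frac_bits)
```

A power-of-two scale lets hardware rescale with bit shifts instead of multipliers, which is one common reason to prefer it on FPGAs; whether the paper uses this particular scheme is not stated in the abstract.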
Pages: 18
Related papers
50 in total
  • [1] An OpenCL-Based Hybrid CNN-RNN Inference Accelerator On FPGA
    Sun, Yunfei
    Liu, Brian
    Xu, Xianchao
    2019 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (ICFPT 2019), 2019, : 283 - 286
  • [2] OpenCL-based design of an FPGA accelerator for quantum annealing simulation
    Hasitha Muthumala Waidyasooriya
    Masanori Hariyama
    Masamichi J. Miyama
    Masayuki Ohzeki
    The Journal of Supercomputing, 2019, 75 : 5019 - 5039
  • [3] OpenCL-based design of an FPGA accelerator for quantum annealing simulation
    Waidyasooriya, Hasitha Muthumala
    Hariyama, Masanori
    Miyama, Masamichi J.
    Ohzeki, Masayuki
    JOURNAL OF SUPERCOMPUTING, 2019, 75 (08): : 5019 - 5039
  • [4] An OpenCL-Based FPGA Accelerator for Compressed YOLOv2
    Yang, Anrong
    Li, Yuanhui
    Shu, Hongqiao
    Deng, Jianlin
    Ma, Chuanzhao
    Li, Zheng
    Wang, Qigang
    2019 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (ICFPT 2019), 2019, : 235 - 238
  • [5] A Scalable OpenCL-Based FPGA Accelerator For YOLOv2
    Xu, Ke
    Wang, Xiaoyun
    Wang, Dong
    2019 27TH IEEE ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2019, : 317 - 317
  • [6] Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network
    Zhang, Jialiang
    Li, Jing
    FPGA'17: PROCEEDINGS OF THE 2017 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE GATE ARRAYS, 2017, : 25 - 34
  • [7] Improving the Performance of Whale Optimization Algorithm through OpenCL-Based FPGA Accelerator
    Jiang, Qiangqiang
    Guo, Yuanjun
    Yang, Zhile
    Wang, Zheng
    Yang, Dongsheng
    Zhou, Xianyu
    COMPLEXITY, 2020, 2020
  • [8] Crack Detection and Comparison Study Based on Faster R-CNN and Mask R-CNN
    Xu, Xiangyang
    Zhao, Mian
    Shi, Peixin
    Ren, Ruiqi
    He, Xuhui
    Wei, Xiaojun
    Yang, Hao
    SENSORS, 2022, 22 (03)
  • [9] CAPTCHA Recognition Based on Faster R-CNN
    Du, Feng-Lin
    Li, Jia-Xing
    Yang, Zhi
    Chen, Peng
    Wang, Bing
    Zhang, Jun
    INTELLIGENT COMPUTING THEORIES AND APPLICATION, ICIC 2017, PT II, 2017, 10362 : 597 - 605
  • [10] OpenCL-Based Design of an FPGA Accelerator for H.266/VVC Transform and Quantization
    Waidyasooriya, Hasitha Muthumala
    Hariyama, Masanori
    Iwasaki, Hiroe
    Kobayashi, Daisuke
    Omori, Yuya
    Nakamura, Ken
    Nitta, Koyo
    Sano, Kimikazu
    2022 IEEE 65TH INTERNATIONAL MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS (MWSCAS 2022), 2022,