Pflow: An end-to-end heterogeneous acceleration framework for CNN inference on FPGAs

Citations: 0

Authors:

Wan, Yi [1]

Xie, Xianzhong [1]

Yi, Lingjie [1]

Jiang, Bo [1]

Chen, Junfan [2]

Jiang, Yi [1]

Affiliations:

[1] Chongqing Univ Posts & Telecommun, Coll Comp Sci & Technol, Chongqing 400065, Peoples R China

[2] Chongqing Haiyunjiexun Technol Co Ltd, Chongqing, Peoples R China

Source:

JOURNAL OF SYSTEMS ARCHITECTURE | 2024 / Vol. 150

Keywords:

Heterogeneous computing; Computation graph reconstruction; Acceleration framework; FPGA; CONVOLUTIONAL NEURAL-NETWORKS; DESIGN; FLOW;

DOI:

10.1016/j.sysarc.2024.103113

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Field-Programmable Gate Arrays (FPGAs), renowned for their high performance per watt, are extensively used to accelerate Convolutional Neural Networks (CNNs) in edge computing environments, primarily through dataflow-based and instruction-set-based approaches. Compared to the instruction-set-based approach, which features fast and versatile circuit design, the dataflow-based approach can significantly enhance performance at the expense of design versatility. Nevertheless, edge computing environments require both high energy efficiency and adaptability to various scenarios. This paper proposes a novel end-to-end heterogeneous acceleration framework for CNN inference on FPGAs, named Pflow. The basic idea is to decouple network deployment from hardware details with a hardware-software co-design approach. First, a dataflow accelerator with an adaptive scheduling strategy is proposed. The adaptive scheduling strategy, along with a scalable design, maximizes hardware utilization in terms of computing resources and bandwidth. Second, we design a novel operator-perception method to automate the processes of network reconstruction and operator fusion. Third, we integrate Pflow into the industrial-grade deep learning framework Paddle-Lite. We evaluate Pflow by implementing several networks on two representative FPGA platforms. Experimental results demonstrate that Pflow achieves energy efficiencies of 46.5 GOPS/W on Xilinx Zynq UltraScale+ MPSoC 3EG and 59.4 GOPS/W on Virtex UltraScale+ XCVU13P. It also reaches a throughput of up to 255.7 GOPS on the former and 3.686 TOPS on the latter.

Pages: 14
