FlexCNN: An End-to-end Framework for Composing CNN Accelerators on FPGA

Cited by: 19
Authors
Basalama, Suhail [1]
Sohrabizadeh, Atefeh [1]
Wang, Jie [1]
Guo, Licheng [1]
Cong, Jason [1]
Affiliations
[1] Univ Calif Los Angeles, 404 Westwood Blvd Engn,6 Room 468, Los Angeles, CA 90095 USA
Keywords
FPGA; CNN; ONNX; systolic array; transposed convolution; dilated convolution; OpenPose; U-Net; E-Net
DOI
10.1145/3570928
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
With reduced data reuse and parallelism, recent convolutional neural networks (CNNs) create new challenges for FPGA acceleration. Systolic arrays (SAs) are efficient, scalable architectures for convolutional layers, but without proper optimizations, their efficiency drops dramatically for three reasons: (1) the differing dimensions within same-type layers, (2) the different types of convolution layers, especially transposed and dilated convolutions, and (3) the complex dataflow graphs of CNNs. Furthermore, significant overheads arise when integrating FPGAs into machine learning frameworks. Therefore, we present a flexible, composable architecture called FlexCNN, which delivers high computation efficiency by employing dynamic tiling, layer fusion, and data layout optimizations. Additionally, we implement a novel versatile SA to process normal, transposed, and dilated convolutions efficiently. FlexCNN also uses a fully pipelined software-hardware integration that alleviates the software overheads. Moreover, with an automated compilation flow, FlexCNN takes a CNN in the ONNX representation, performs a design space exploration, and generates an FPGA accelerator. The framework is tested using three complex CNNs: OpenPose, U-Net, and E-Net. The architecture optimizations achieve a 2.3x performance improvement. Compared to a standard SA, the versatile SA achieves close-to-ideal speedups of up to 15.98x for transposed convolutions and 13.42x for dilated convolutions, with a 6% average area overhead. The pipelined integration leads to a 5x speedup for OpenPose.
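As a rough illustration of why transposed and dilated convolutions can be costly on a dense convolution engine, the sketch below counts wasted work under the common zero-insertion formulation. This formulation is an assumption here: the abstract does not say how the standard SA handles these layers. The function names are hypothetical, padding is ignored, and the ratios are generic arithmetic, not figures from the paper.

```python
def dilated_dense_mac_ratio(k: int, d: int) -> float:
    """Dense MACs per useful MAC when a k x k kernel with dilation d
    is expanded with zeros into a dense kernel (assumed formulation)."""
    k_eff = d * (k - 1) + 1          # side length of the zero-inserted kernel
    return (k_eff * k_eff) / (k * k)


def transposed_zero_insert_ratio(h: int, w: int, s: int) -> float:
    """Elements in the zero-inserted input map per original element when a
    stride-s transposed convolution is computed as a dense convolution
    over an upsampled input (assumed formulation, padding ignored)."""
    h_up = (h - 1) * s + 1           # height after inserting s-1 zeros between rows
    w_up = (w - 1) * s + 1           # width after inserting s-1 zeros between columns
    return (h_up * w_up) / (h * w)


if __name__ == "__main__":
    # A 3 x 3 kernel with dilation 2 becomes a 5 x 5 dense kernel: 25/9 dense MACs per useful MAC.
    print(dilated_dense_mac_ratio(3, 2))
    # A 32 x 32 input with stride 2 becomes 63 x 63: most processed elements are inserted zeros.
    print(transposed_zero_insert_ratio(32, 32, 2))
```

Under these assumptions the ratio grows roughly with the square of the dilation rate or stride, which gives a sense of the upper bound on speedup available to an engine that skips the inserted zeros.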
Pages: 32