A High-Level Modeling Framework for Estimating Hardware Metrics of CNN Accelerators

Cited by: 9

Authors:

Juracy, Leonardo Rezende ^{[1]}

Moreira, Matheus Trevisan ^{[2]}

Amory, Alexandre de Morais ^{[3]}

Hampel, Alexandre F. ^{[1]}

Moraes, Fernando Gehm ^{[1]}

Affiliations:

[1] Pontifical Catholic Univ Rio Grande Sul PUCRS, Sch Technol, BR-90619900 Porto Alegre, RS, Brazil

[2] Chronos Tech, San Diego, CA 92122 USA

[3] TeCIP Inst, Scuola Super SantAnna, I-56124 Pisa, Italy

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS | 2021 / Vol. 68 / No. 11

Keywords:

Convolutional neural networks; Space exploration; Estimation; Computer architecture; Training; Hardware acceleration; Convolution; CNN; convolution hardware accelerator; system simulator; PPA; design space exploration;

DOI:

10.1109/TCSI.2021.3104644

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

GPUs became the reference platform for both the training and inference phases of Convolutional Neural Networks (CNNs) because their architecture is tailored to the CNN operators. However, GPUs are power-hungry architectures. A path to enable the deployment of CNNs in energy-constrained devices is adopting hardware accelerators for the inference phase. The design space exploration of CNNs using standard approaches, such as RTL, is limited due to their complexity. Thus, designers need frameworks that enable design space exploration and deliver accurate hardware estimation metrics to deploy CNNs. This work proposes a framework to explore the CNN design space, providing power, performance, and area (PPA) estimations. The heart of the framework is a system simulator. The system simulator front-end is TensorFlow, and the back-end consists of performance estimations obtained from the physical synthesis of hardware accelerators, not only from components like multipliers and adders. The first set of results evaluates the CNN accuracy using integer quantization, the accelerators' PPA after physical synthesis, and the benefits of using a system simulator. These results allow a rich design space exploration, enabling the selection of the best set of CNN parameters to meet the design constraints.


Pages: 4783 - 4795

Page count: 13
