High Utilization Energy-Aware Real-Time Inference Deep Convolutional Neural Network Accelerator

Cited by: 4
Authors
Lin, Kuan-Ting [1 ]
Chiu, Ching-Te [1 ]
Chang, Jheng-Yi [2 ]
Hsiao, Shan-Chien [1 ]
Affiliations
[1] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu, Taiwan
[2] Natl Tsing Hua Univ, Inst Commun Engn, Hsinchu, Taiwan
Keywords
CNN; Accelerator; Energy-Aware; Real-Time Inference; High Utilization;
DOI
10.1109/ISCAS51556.2021.9401526
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline Classification Codes
0808; 0809
Abstract
Deep Convolutional Neural Networks (DCNNs) are widely used in computer vision tasks. On edge devices, however, inference still incurs excessive computational complexity and data-access volume, so the inference latency of state-of-the-art models remains impractical for real-world applications. In this paper, we propose a high-utilization, energy-aware, real-time-inference deep convolutional neural network accelerator that outperforms current accelerators. First, we use 1x1 convolution kernels as the smallest element of the computing unit and design a suitable computing-unit configuration for each model according to its requirements. Second, we use a Reuse Feature SRAM to keep the output of the current layer on chip and use it as the input of the next layer. Moreover, we introduce an Output Reuse Strategy and a Ring Stream dataflow, which increase the on-chip data reuse rate and reduce the amount of data exchanged between the chip and DRAM. Finally, we present an On-fly Pooling Module so that pooling layers are computed directly on chip. With the proposed methods, the implemented CNN accelerator chip achieves an extremely high hardware utilization rate. On ECNN [1], we substantially reduce data transfer: compared with methods without a reuse strategy, the data-access volume is reduced by 533x. At the same time, the accelerator provides enough computing power for real-time execution of existing image classification models, VGG16 [2] and MobileNet [3]. Compared with the design in [4], it achieves a 7.52x speedup and 1.92x energy efficiency.
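The abstract describes the accelerator only at a high level. As a rough, software-level illustration of two of the ideas it names (treating 1x1 kernels as the smallest compute element, and keeping a layer's output on chip so that pooling and the next layer consume it directly), a minimal NumPy sketch follows. All function names, tensor shapes, and the two-layer toy pipeline are hypothetical and chosen purely for illustration; they do not reproduce the authors' hardware design, Ring Stream dataflow, or RTL.

import numpy as np

def conv_as_1x1_sum(x, w, stride=1):
    # Compute a KxK convolution as the sum of K*K shifted 1x1 convolutions
    # ("valid" padding). x: (C_in, H, W), w: (C_out, C_in, K, K).
    c_out, c_in, k, _ = w.shape
    _, h, wd = x.shape
    h_out = (h - k) // stride + 1
    w_out = (wd - k) // stride + 1
    y = np.zeros((c_out, h_out, w_out))
    for dy in range(k):                    # each kernel tap acts as an
        for dx in range(k):                # independent 1x1 convolution
            patch = x[:, dy:dy + stride * h_out:stride,
                         dx:dx + stride * w_out:stride]   # (C_in, H_out, W_out)
            tap = w[:, :, dy, dx]                          # (C_out, C_in)
            y += np.einsum('oc,chw->ohw', tap, patch)      # accumulate partial sums
    return y

def maxpool2x2(x):
    # 2x2 max pooling applied before the result would leave the "chip".
    c, h, w = x.shape
    return x[:, :h // 2 * 2, :w // 2 * 2].reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 32, 32))        # input feature map
w1 = rng.standard_normal((16, 3, 3, 3))     # layer-1 3x3 kernels
w2 = rng.standard_normal((32, 16, 3, 3))    # layer-2 3x3 kernels

# The layer-1 result stays in a local buffer (standing in for the Reuse Feature
# SRAM) and is consumed directly by pooling and layer 2, so the intermediate
# feature map never makes a round trip to "DRAM".
reuse_buffer = maxpool2x2(np.maximum(conv_as_1x1_sum(x, w1), 0))   # conv -> ReLU -> pool
y = conv_as_1x1_sum(reuse_buffer, w2)
print(y.shape)   # (32, 13, 13)

In an actual design the inner accumulation would map to MAC arrays and the reuse buffer to on-chip SRAM; the sketch only shows that a KxK kernel decomposes into 1x1 partial sums and that the intermediate feature map never leaves the buffer.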
Pages: 5
Related Papers (50 in total)
  • [1] An Energy-Aware Bit-Serial Streaming Deep Convolutional Neural Network Accelerator
    Hsu, Lien-Chih
    Chiu, Ching-Te
    Lin, Kuan-Ting
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 4609 - 4613
  • [2] ESSA: An energy-aware bit-serial streaming deep convolutional neural network accelerator
    Hsu, Lien-Chih
    Chiu, Ching-Te
    Lin, Kuan-Ting
    Chou, Hsing-Huan
    Pu, Yen-Yu
    JOURNAL OF SYSTEMS ARCHITECTURE, 2020, 111
  • [3] AxoNN: Energy-Aware Execution of Neural Network Inference on Multi-Accelerator Heterogeneous SoCs
    Dagli, Ismet
    Cieslewicz, Alexander
    McClurg, Jedidiah
    Belviranli, Mehmet E.
    PROCEEDINGS OF THE 59TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2022, 2022, : 1069 - 1074
  • [4] EALI: Energy-aware layer-level scheduling for convolutional neural network inference services on GPUs
    Yao, Chunrong
    Liu, Wantao
    Liu, Zhibing
    Yan, Longchuan
    Hu, Songlin
    Tang, Weiqing
    NEUROCOMPUTING, 2022, 507 : 265 - 281
  • [5] HEART: A Heterogeneous Energy-Aware Real-Time scheduler
    Moulik, Sanjay
    Devaraj, Rajesh
    Sarkar, Arnab
    2019 32ND INTERNATIONAL CONFERENCE ON VLSI DESIGN AND 2019 18TH INTERNATIONAL CONFERENCE ON EMBEDDED SYSTEMS (VLSID), 2019, : 476 - 481
  • [6] Speed modulation in energy-aware real-time systems
    Bini, E
    Buttazzo, G
    Lipari, G
    17th Euromicro Conference on Real-Time Systems, Proceedings, 2005, : 3 - 10
  • [7] A Smart Deep Convolutional Neural Network for Real-Time Surface Inspection
    Passos, Adriano G.
    Cousseau, Tiago
    Luersen, Marco A.
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2022, 41 (02): : 583 - 593
  • [8] HEARS: A heterogeneous energy-aware real-time scheduler
    Moulik, Sanjay
    Chaudhary, Rishabh
    Das, Zinea
    MICROPROCESSORS AND MICROSYSTEMS, 2020, 72
  • [9] Energy-Aware Real-Time Scheduling in the Linux Kernel
    Scordino, Claudio
    Abeni, Luca
    Lelli, Juri
    33RD ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, 2018, : 601 - 608
  • [10] Energy-Aware Scheduling for Real-Time Systems: A Survey
    Bambagini, Mario
    Marinoni, Mauro
    Aydin, Hakan
    Buttazzo, Giorgio
    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2016, 15 (01)