High Utilization Energy-Aware Real-Time Inference Deep Convolutional Neural Network Accelerator

Cited by: 4
Authors
Lin, Kuan-Ting [1]
Chiu, Ching-Te [1]
Chang, Jheng-Yi [2]
Hsiao, Shan-Chien [1]
Affiliations
[1] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu, Taiwan
[2] Natl Tsing Hua Univ, Inst Commun Engn, Hsinchu, Taiwan
Keywords
CNN; Accelerator; Energy-Aware; Real-Time Inference; High Utilization;
DOI
10.1109/ISCAS51556.2021.9401526
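Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification codes
0808; 0809;
Abstract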
Deep convolutional neural networks (DCNNs) have been widely used in computer vision tasks. However, on edge devices, even inference alone requires too much computation and data access. Because of these shortcomings, the inference latency of state-of-the-art models is still impractical for real-world applications. In this paper, we propose a high-utilization, energy-aware, real-time inference deep convolutional neural network accelerator that outperforms current accelerators. First, we use 1x1 convolution kernels as the smallest building block of the computing unit, and we design a suitable computing unit for each model based on that model's requirements. Second, we use a Reuse Feature SRAM to store the output of the current layer on chip and use it as the input of the next layer. Moreover, we introduce an Output Reuse Strategy and a Ring Stream Dataflow, which not only increase the on-chip data reuse rate but also reduce the amount of data exchanged between the chip and DRAM. Finally, we present an On-fly Pooling Module that allows the pooling layer to be computed directly on chip. With the proposed methods, the implemented CNN acceleration chip achieves an extremely high hardware utilization rate. We substantially reduce data transfer on the specific module, ECNN [1]: compared with methods that use no reuse strategy, the amount of data access is reduced by a factor of 533. At the same time, the design has enough computing power for real-time execution of the existing image classification models VGG16 [2] and MobileNet [3]. Compared with the design in [4], we achieve a 7.52x speedup and 1.92x the energy efficiency.
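The abstract does not say how larger kernels are mapped onto 1x1 units. As a minimal sketch of one common way to do it, and not necessarily the authors' design, a KxK convolution can be written as the sum of K*K 1x1 convolutions, each applied to a shifted view of the input. The function names `conv2d_direct` and `conv2d_via_1x1` and the nested-list tensor layout are assumptions made for this illustration only.

```python
import random


def conv2d_direct(x, w):
    """Reference 'valid' convolution (cross-correlation, as used in CNNs).

    x: input,   indexed x[c_in][row][col]
    w: weights, indexed w[c_out][c_in][k_row][k_col] (square kernel)
    """
    c_out, c_in, k = len(w), len(x), len(w[0][0])
    oh, ow = len(x[0]) - k + 1, len(x[0][0]) - k + 1
    return [[[sum(w[o][c][di][dj] * x[c][i + di][j + dj]
                  for c in range(c_in)
                  for di in range(k)
                  for dj in range(k))
              for j in range(ow)]
             for i in range(oh)]
            for o in range(c_out)]


def conv2d_via_1x1(x, w):
    """Same result, built only from 1x1 convolutions.

    Each kernel position (di, dj) contributes one 1x1 convolution over the
    input shifted by (di, dj); the partial sums are accumulated in `out`.
    """
    c_out, c_in, k = len(w), len(x), len(w[0][0])
    oh, ow = len(x[0]) - k + 1, len(x[0][0]) - k + 1
    out = [[[0] * ow for _ in range(oh)] for _ in range(c_out)]
    for di in range(k):
        for dj in range(k):
            # The 1x1 kernel for this position: one weight per (c_out, c_in).
            w1x1 = [[w[o][c][di][dj] for c in range(c_in)]
                    for o in range(c_out)]
            for o in range(c_out):
                for i in range(oh):
                    for j in range(ow):
                        out[o][i][j] += sum(
                            w1x1[o][c] * x[c][i + di][j + dj]
                            for c in range(c_in))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    # Integer data keeps the comparison exact (no floating-point rounding).
    x = [[[rng.randint(-3, 3) for _ in range(6)] for _ in range(6)]
         for _ in range(2)]
    w = [[[[rng.randint(-3, 3) for _ in range(3)] for _ in range(3)]
          for _ in range(2)] for _ in range(4)]
    print(conv2d_direct(x, w) == conv2d_via_1x1(x, w))
```

In this sketch a 3x3 layer uses nine 1x1 passes and a 1x1 layer, as in MobileNet's pointwise convolutions, uses one, so the same unit handles both kernel sizes without idle multipliers. Whether the proposed chip schedules its computation this way is not stated in the abstract.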
Pages: 5