Heterogeneous Scheduling of Deep Neural Networks for Low-power Real-time Designs

Cited by: 10
Authors
Shea, Colin [1]
Mohsenin, Tinoosh [1]
Affiliations
[1] Univ Maryland Baltimore Cty, 1000 Hilltop Circle, Catonsville, MD 21250 USA
Keywords
Machine learning; real-time; scheduling; co-design; hardware; software; FPGA;
DOI
10.1145/3358699
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Deep neural networks have become the readiest answer to a range of application challenges, including image recognition, stock analysis, natural language processing, and biomedical applications such as seizure detection, all while outperforming prior leading solutions that relied heavily on hand-engineered techniques. However, deploying these neural networks often requires computationally intensive and memory-intensive solutions. These requirements make it challenging to deploy Deep Neural Networks (DNNs) in embedded, real-time, low-power applications, where classic architectures such as GPUs and CPUs still impose a significant power burden. Systems-on-Chip (SoCs) with Field-Programmable Gate Arrays (FPGAs) can improve performance and allow finer-grained control of resources than CPUs or GPUs, but it is difficult to find the optimal balance between hardware and software to improve DNN efficiency. The current research literature offers few solutions for optimizing hardware and software deployments of DNNs in embedded low-power systems. To address the restricted computational resources and low-power needs of deploying these networks, we describe and implement a domain-specific metric model for optimizing task deployment on differing platforms, both hardware and software. Next, we propose a DNN hardware accelerator called Scalable Low-power Accelerator for real-time deep neural Networks (SCALENet) that includes multithreaded software workers. Finally, we propose a heterogeneous-aware scheduler that uses the DNN-specific metric models and the SCALENet accelerator to allocate each task to a resource by solving a numerical cost for a series of domain objectives. To demonstrate the applicability of our contribution, we deploy nine modern deep network architectures, each containing a different number of parameters, within the context of two different neural network applications: image processing and biomedical seizure detection.
Using the metric modeling techniques integrated into the heterogeneous-aware scheduler and the SCALENet accelerator, we demonstrate the ability to meet computational requirements, adapt to multiple architectures, and lower power by providing an optimized task-to-resource allocation. Our heterogeneous-aware scheduler reduces power consumption by 10% of the total system power, does not affect the accuracy of the networks, and still meets the real-time deadlines. We demonstrate the ability to match or exceed the energy efficiency of NVIDIA GPUs when evaluated against the Jetson TK1 with embedded GPU SoC, with a 4x power savings in a power envelope of 2.0 W. When compared to existing FPGA-based accelerators, SCALENet's accelerator and heterogeneous-aware scheduler achieve a 4.8x improvement in energy efficiency.
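The abstract describes allocating each task to a resource by solving a numerical cost over several objectives, but does not give the cost function. The Python sketch below shows only the general idea of such a cost-based allocation and is not the authors' scheduler: the `Estimate` type, the `cost` and `schedule` functions, the weights, the per-task deadline, and every number are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    """Hypothetical per-resource prediction for one task (illustrative only)."""
    latency_ms: float
    power_w: float


def cost(est, deadline_ms, w_latency=1.0, w_power=1.0):
    """Weighted cost of running a task on a resource.

    Assumption: a resource that would miss the deadline is infeasible,
    so its cost is infinite; otherwise the cost is a weighted sum of
    deadline-normalized latency and power draw.
    """
    if est.latency_ms > deadline_ms:
        return float("inf")
    return w_latency * est.latency_ms / deadline_ms + w_power * est.power_w


def schedule(tasks, deadline_ms):
    """Assign each task to its lowest-cost feasible resource.

    tasks maps task name -> {resource name -> Estimate}.
    """
    plan = {}
    for task, options in tasks.items():
        best = min(options, key=lambda r: cost(options[r], deadline_ms))
        if cost(options[best], deadline_ms) == float("inf"):
            raise ValueError(f"no resource meets the deadline for {task}")
        plan[task] = best
    return plan


# Made-up estimates: a convolution layer and a fully connected layer,
# each of which could run on a CPU worker or on the FPGA fabric.
tasks = {
    "conv1": {"cpu": Estimate(40.0, 1.5), "fpga": Estimate(10.0, 0.8)},
    "fc1": {"cpu": Estimate(5.0, 0.5), "fpga": Estimate(4.0, 0.9)},
}
plan = schedule(tasks, deadline_ms=33.0)
print(plan)
```

With these placeholder numbers, the CPU would miss the deadline for `conv1`, so that task goes to the FPGA, while `fc1` stays on the CPU because its lower power draw outweighs the slightly higher latency. A real scheduler would also have to account for resource contention and dependencies between tasks, which this sketch ignores.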
Pages: 31