Dynamic Neural Network to Enable Run-Time Trade-off between Accuracy and Latency

Cited by: 0
Authors
Yang, Li [1]
Fan, Deliang [1]
Affiliation
[1] Arizona State Univ, Tempe, AZ 85281 USA
Funding
National Science Foundation (USA);
Keywords
dynamic neural networks;
DOI
10.1145/3394885.3431628
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
To deploy powerful deep neural networks (DNNs) on smart but resource-limited IoT devices, many prior works compress DNNs to reduce network size and computational complexity with negligible accuracy degradation, using techniques such as weight quantization, network pruning, and convolution decomposition. However, conventional DNN compression methods generate a smaller but fixed network from a relatively large background model to fit resource-limited hardware acceleration. Such an optimization cannot adjust its structure in real time to adapt to dynamically changing hardware resource allocation and workloads. In this paper, we mainly review our two prior works [13, 15] that tackle this challenge, discussing how to construct a dynamic DNN by means of either uniform or non-uniform sub-net generation methods. To generate multiple non-uniform sub-nets, [15] needs to fully retrain the background model for each sub-net individually, which we call the multi-path method. To reduce the training cost, in this work we further propose a single-path sub-net generation method that can sample multiple sub-nets in different epochs within one training round. The constructed dynamic DNN, consisting of multiple sub-nets, can trade off inference accuracy and latency at run time according to hardware resources and environment requirements. Finally, we study dynamic DNNs with different sub-net generation methods on both the CIFAR-10 and ImageNet datasets. We also present run-time tuning of accuracy and latency on both GPU and CPU.
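The run-time trade-off described in the abstract can be illustrated with a minimal sketch. It assumes that sub-nets are nested slices of one shared weight matrix (a uniform, width-based scheme) and that each sub-net has a profiled latency. The names `SubNet`, `select_subnet`, and `forward`, and all numbers, are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubNet:
    width: float       # fraction of output channels kept (hypothetical)
    latency_ms: float  # profiled latency on the target device (made-up value)


def select_subnet(subnets, budget_ms):
    """Pick the widest sub-net that fits the latency budget.

    Falls back to the fastest sub-net when none fits.
    """
    feasible = [s for s in subnets if s.latency_ms <= budget_ms]
    if feasible:
        return max(feasible, key=lambda s: s.width)
    return min(subnets, key=lambda s: s.latency_ms)


def forward(weights, x, width):
    """Dense layer that uses only the first `width` fraction of the shared rows."""
    keep = max(1, int(len(weights) * width))
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights[:keep]]


# Illustrative sub-net table and shared weights (4 output channels, 2 inputs).
SUBNETS = [SubNet(0.25, 2.0), SubNet(0.5, 4.0), SubNet(1.0, 9.0)]
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]

chosen = select_subnet(SUBNETS, budget_ms=5.0)
output = forward(WEIGHTS, [3.0, 2.0], chosen.width)
```

Because every sub-net reads from the same weights, switching between them at run time needs no model reload; only the slice boundary changes.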
Pages: 587-592
Number of pages: 6
Related papers
50 in total
  • [1] Processing-in-Memory Accelerator for Dynamic Neural Network with Run-Time Tuning of Accuracy, Power and Latency
    Yang, Li
    He, Zhezhi
    Angizi, Shaahin
    Fan, Deliang
    2020 IEEE 33RD INTERNATIONAL SYSTEM-ON-CHIP CONFERENCE (SOCC), 2020, : 117 - 122
  • [2] Run-Time management of energy-performance trade-off in Optical Network-on-Chip
    Luo, Jiating
    Van-Dung Pham
    Killian, Cedric
    Chillet, Daniel
    O'Connor, Ian
    Sentieys, Olivier
    Le Beux, Sebastien
    2018 XXXIII CONFERENCE ON DESIGN OF CIRCUITS AND INTEGRATED SYSTEMS (DCIS), 2018,
  • [3] Exploring the Trade-off Between Accuracy and Observational Latency in Action Recognition
    Chris Ellis
    Syed Zain Masood
    Marshall F. Tappen
    Joseph J. LaViola
    Rahul Sukthankar
    International Journal of Computer Vision, 2013, 101 : 420 - 436
  • [4] Exploring the Trade-off Between Accuracy and Observational Latency in Action Recognition
    Ellis, Chris
    Masood, Syed Zain
    Tappen, Marshall F.
    LaViola, Joseph J., Jr.
    Sukthankar, Rahul
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2013, 101 (03) : 420 - 436
  • [5] Intelligent resource sharing to enable quality of service for network clients: the trade-off between accuracy and complexity
    da Costa, Luis Antonio L. F.
    Kunst, Rafael
    de Freitas, Edison Pignaton
    COMPUTING, 2022, 104 (05) : 1219 - 1231
  • [6] Intelligent resource sharing to enable quality of service for network clients: the trade-off between accuracy and complexity
    Luis Antonio L. F. da Costa
    Rafael Kunst
    Edison Pignaton de Freitas
    Computing, 2022, 104 : 1219 - 1231
  • [7] Prediction-Guided Performance-Energy Trade-off with Continuous Run-Time Adaptation
    Song, Taejoon
    Lo, Daniel
    Suh, G. Edward
    ISLPED '16: PROCEEDINGS OF THE 2016 INTERNATIONAL SYMPOSIUM ON LOW POWER ELECTRONICS AND DESIGN, 2016, : 224 - 229
  • [8] Run-time versus compile-time instruction scheduling in superscalar (RISC) processors: Performance and trade-off
    Leung, A
    Palem, KV
    Ungureanu, C
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1997, 45 (01) : 13 - 28
  • [9] Analyzing the Energy-Latency-Area-Accuracy Trade-off Across Contemporary Neural Networks
    Jain, Vikram
    Mei, Linyan
    Verhelst, Marian
    2021 IEEE 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS), 2021,
  • [10] Trade-off between accuracy and tractability of Network Calculus in FIFO networks
    Bouillard, Anne
    PERFORMANCE EVALUATION, 2022, 153