Dynamic Neural Network to Enable Run-Time Trade-off between Accuracy and Latency

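Authors
Yang, Li [1]
Fan, Deliang [1]
Affiliations
[1] Arizona State Univ, Tempe, AZ 85281 USA
Funding
National Science Foundation (USA)
Keywords
dynamic neural networks
DOI
10.1145/3394885.3431628
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract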
To deploy powerful deep neural networks (DNNs) on smart but resource-limited IoT devices, many prior works have proposed compressing DNNs to reduce network size and computational complexity with negligible accuracy degradation, using techniques such as weight quantization, network pruning, and convolution decomposition. However, conventional DNN compression methods generate a smaller but fixed network from a relatively large background model to achieve hardware acceleration under limited resources. Such an optimized network cannot adjust its structure in real time to adapt to dynamic hardware resource allocation and workloads. In this paper, we mainly review our two prior works [13, 15] that tackle this challenge, discussing how to construct a dynamic DNN by means of either uniform or non-uniform sub-net generation methods. To generate multiple non-uniform sub-nets, [15] needs to fully retrain the background model for each sub-net individually, which we refer to as the multi-path method. To reduce the training cost, in this work we further propose a single-path sub-net generation method that can sample multiple sub-nets in different epochs within one training round. The constructed dynamic DNN, consisting of multiple sub-nets, can trade off inference accuracy against latency at run time according to hardware resources and environment requirements. Finally, we study dynamic DNNs with different sub-net generation methods on both the CIFAR-10 and ImageNet datasets. We also present run-time tuning of accuracy and latency on both GPU and CPU.
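As a minimal sketch of the run-time trade-off described in the abstract, the Python snippet below picks, from a set of pre-generated sub-nets, the most accurate one that fits a latency budget. The `SubNet` class, the `select_subnet` function, the sub-net names, and all accuracy and latency numbers are hypothetical placeholders chosen for illustration; they are not the authors' implementation or reported results.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SubNet:
    """One sub-net of a dynamic DNN (all fields are illustrative)."""
    name: str
    accuracy: float    # placeholder accuracy, not a measured result
    latency_ms: float  # placeholder latency, not a measured result


def select_subnet(subnets: Sequence[SubNet], latency_budget_ms: float) -> SubNet:
    """Return the most accurate sub-net whose latency fits the budget.

    If no sub-net fits, fall back to the fastest one so that inference
    can still proceed (an assumed policy, not taken from the paper).
    """
    feasible = [s for s in subnets if s.latency_ms <= latency_budget_ms]
    if not feasible:
        return min(subnets, key=lambda s: s.latency_ms)
    return max(feasible, key=lambda s: s.accuracy)


# Hypothetical sub-nets; the numbers are made up for the example.
SUBNETS = [
    SubNet("width-0.25x", accuracy=0.60, latency_ms=2.0),
    SubNet("width-0.50x", accuracy=0.70, latency_ms=4.0),
    SubNet("width-0.75x", accuracy=0.75, latency_ms=7.0),
    SubNet("width-1.00x", accuracy=0.78, latency_ms=10.0),
]

if __name__ == "__main__":
    for budget in (1.0, 5.0, 100.0):
        chosen = select_subnet(SUBNETS, budget)
        print(f"budget={budget} ms -> {chosen.name}")
```

In a deployment, the latency budget would presumably be updated as the available hardware resources change, and the selected sub-net would be used for the next inference request.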
Pages: 587-592
Page count: 6