Dynamic Neural Network to Enable Run-Time Trade-off between Accuracy and Latency

Cited by: 0

Authors:

Yang, Li [1]

Fan, Deliang [1]

Affiliations:

[1] Arizona State Univ, Tempe, AZ 85281 USA

Source:

2021 26TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC) | 2021

Funding:

U.S. National Science Foundation;

Keywords:

dynamic neural networks;

DOI:

10.1145/3394885.3431628

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

To deploy powerful deep neural networks (DNNs) on smart but resource-limited IoT devices, many prior works have proposed compressing DNNs to reduce network size and computational complexity with negligible accuracy degradation, using techniques such as weight quantization, network pruning, and convolution decomposition. However, conventional DNN compression methods generate a smaller but fixed network from a relatively large background model to achieve hardware acceleration under limited resources. Such an optimization lacks the ability to adjust its structure in real time to adapt to dynamic hardware resource allocation and workloads. In this paper, we mainly review our two prior works [13, 15] that tackle this challenge, discussing how to construct a dynamic DNN by means of either uniform or non-uniform sub-net generation methods. To generate multiple non-uniform sub-nets, [15] needs to fully retrain the background model for each sub-net individually, which we refer to as the multi-path method. To reduce the training cost, in this work we further propose a single-path sub-net generation method that can sample multiple sub-nets in different epochs within one training round. The constructed dynamic DNN, consisting of multiple sub-nets, provides the ability to trade off inference accuracy and latency at run time according to hardware resources and environment requirements. Finally, we study dynamic DNNs with different sub-net generation methods on both the CIFAR-10 and ImageNet datasets. We also present the run-time tuning of accuracy and latency on both GPU and CPU.
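The run-time trade-off described in the abstract can be pictured as choosing, for each request, one sub-net out of those that make up the dynamic DNN. The following is only a minimal sketch of such a selection policy, not the authors' method: the `SubNet` record, the `pick_subnet` helper, and every accuracy and latency figure are hypothetical placeholders chosen for illustration, not results from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SubNet:
    """Hypothetical profile of one sub-net inside a dynamic DNN."""
    name: str
    accuracy: float    # placeholder value, not a measured result
    latency_ms: float  # placeholder value, not a measured result


def pick_subnet(subnets: Sequence[SubNet], budget_ms: float) -> Optional[SubNet]:
    """Return the most accurate sub-net whose latency fits the budget.

    Returns None when no sub-net meets the latency budget.
    """
    feasible = [s for s in subnets if s.latency_ms <= budget_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda s: s.accuracy)


# Illustrative profiles only (assumed numbers).
SUBNETS = [
    SubNet("small", accuracy=0.80, latency_ms=2.0),
    SubNet("medium", accuracy=0.85, latency_ms=4.0),
    SubNet("full", accuracy=0.90, latency_ms=8.0),
]

if __name__ == "__main__":
    for budget in (1.0, 5.0, 100.0):
        choice = pick_subnet(SUBNETS, budget)
        print(budget, choice.name if choice else "no feasible sub-net")
```

In this sketch a tighter latency budget selects a smaller sub-net and a looser one selects the full model; how the sub-nets are actually generated and profiled is left to the paper itself.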

Pages: 587-592

Page count: 6
