Dynamic Neural Network to Enable Run-Time Trade-off between Accuracy and Latency

Citations: 0

Authors:

Yang, Li [1]

Fan, Deliang [1]

Affiliations:

[1] Arizona State Univ, Tempe, AZ 85281 USA

Source:

2021 26TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC) | 2021

Funding:

U.S. National Science Foundation;

Keywords:

dynamic neural networks;

DOI:

10.1145/3394885.3431628

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

To deploy powerful deep neural networks (DNNs) on smart but resource-limited IoT devices, many prior works compress DNNs to reduce network size and computational complexity with negligible accuracy degradation, using techniques such as weight quantization, network pruning, and convolution decomposition. However, conventional DNN compression methods generate a smaller but fixed network from a relatively large background model to achieve acceleration on resource-limited hardware. Such optimization cannot adjust the network structure in real time to adapt to dynamic hardware resource allocation and workloads. In this paper, we mainly review our two prior works [13, 15] that tackle this challenge, discussing how to construct a dynamic DNN by means of either uniform or non-uniform sub-net generation methods. To generate multiple non-uniform sub-nets, [15] needs to fully retrain the background model for each sub-net individually, which we call the multi-path method. To reduce the training cost, in this work we further propose a single-path sub-net generation method that can sample multiple sub-nets in different epochs within one training round. The constructed dynamic DNN, consisting of multiple sub-nets, can trade off inference accuracy against latency at run time according to hardware resources and environment requirements. Finally, we study dynamic DNNs with different sub-net generation methods on both the CIFAR-10 and ImageNet datasets. We also present run-time tuning of accuracy and latency on both GPU and CPU.
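The run-time trade-off described in the abstract can be pictured as a lookup over pre-profiled sub-nets. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the `SubNet` record, the `select_subnet` function, the selection rule (most accurate sub-net that fits a latency budget), and every number in `PROFILE` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SubNet:
    """Hypothetical record describing one sub-net of a dynamic DNN."""
    name: str
    accuracy: float    # placeholder accuracy in [0, 1]
    latency_ms: float  # placeholder latency profiled on the target device


def select_subnet(subnets: Sequence[SubNet], budget_ms: float) -> SubNet:
    """Return the most accurate sub-net whose latency fits the budget.

    If no sub-net fits, fall back to the fastest one so that inference
    can still proceed (an assumed policy, chosen for illustration).
    """
    if not subnets:
        raise ValueError("at least one sub-net is required")
    feasible = [s for s in subnets if s.latency_ms <= budget_ms]
    if feasible:
        return max(feasible, key=lambda s: s.accuracy)
    return min(subnets, key=lambda s: s.latency_ms)


# Placeholder profile: these values are invented, NOT results from the paper.
PROFILE = [
    SubNet("subnet_25", accuracy=0.80, latency_ms=2.0),
    SubNet("subnet_50", accuracy=0.88, latency_ms=4.0),
    SubNet("subnet_100", accuracy=0.92, latency_ms=9.0),
]

if __name__ == "__main__":
    for budget in (1.0, 5.0, 100.0):
        chosen = select_subnet(PROFILE, budget)
        print(f"budget={budget} ms -> {chosen.name}")
```

A caller would re-run `select_subnet` whenever the available hardware budget changes, which is one simple way to realize the switching between sub-nets that the abstract describes.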

Pages: 587-592

Page count: 6
