EENet: Energy Efficient Neural Networks with Run-time Power Management

Cited by: 0

Authors:

Li, Xiangjie [1]

Shen, Yingtao [1]

Zou, An [1]

Ma, Yehan [1]

Affiliations:

[1] Shanghai Jiao Tong University, Shanghai, People's Republic of China

Source:

2023 60th ACM/IEEE Design Automation Conference (DAC) | 2023

Keywords:

Neural Networks; Early Exit; Energy Efficiency; Inference Time; Feedback Control

DOI:

10.1109/DAC56929.2023.10247701

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Deep learning approaches, such as convolutional neural networks (CNNs), have achieved tremendous success in a wide range of applications. However, one of the challenges in deploying deep learning models on resource-constrained systems is their high energy cost. As a dynamic inference approach, early exit adds exit layers to a network so that inference can terminate early with accurate results, saving energy. Current passive decision-making for energy regulation in early-exit networks cannot adapt to the ongoing inference status, varying inference workloads, or timing constraints, let alone guide a reasonable configuration of the computing platform as inference proceeds to realize potential energy savings. In this paper, we propose Energy Efficient Neural Networks (EENet), which adds a plug-in module to state-of-the-art networks by incorporating run-time power management. Within each inference, we predict where the network will exit and adjust the computing configuration (i.e., frequency and voltage) accordingly over a small timescale. Considering multiple inferences over a large timescale, we provide frequency and voltage calibration advice, given the inference workloads and timing constraints. Finally, the dynamic voltage and frequency scaling (DVFS) governor configures voltage and frequency to execute the network according to the prediction and calibration. Extensive experimental results demonstrate that EENet achieves up to 63.8% energy savings compared with classic deep learning networks and 21.5% energy savings compared with early exit under state-of-the-art exiting strategies, together with improved timing performance.
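
To make the idea in the abstract concrete, here is a minimal sketch of how a predicted exit point could drive the choice of a voltage/frequency setting. It is not EENet's actual algorithm. It assumes the standard CMOS dynamic power model P = C_eff * V^2 * f, so the energy for a fixed number of cycles scales with V^2, and it simply picks the lowest-energy operating point that still meets a deadline. All names (`OperatingPoint`, `pick_operating_point`, `POINTS`) and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """A hypothetical DVFS setting: clock frequency and matching supply voltage."""
    freq_hz: float
    volt: float


def dynamic_energy(cycles: float, op: OperatingPoint, c_eff: float = 1e-9) -> float:
    """Dynamic energy for `cycles` clock cycles.

    From P = C_eff * V^2 * f and t = cycles / f, E = C_eff * V^2 * cycles.
    The effective capacitance c_eff is an assumed placeholder value.
    """
    return c_eff * op.volt ** 2 * cycles


def pick_operating_point(cycles_to_exit: float, deadline_s: float, points):
    """Return the lowest-energy point that finishes before the deadline.

    `cycles_to_exit` stands in for the work implied by the predicted exit.
    If no point meets the deadline, fall back to the fastest one.
    """
    feasible = [p for p in points if cycles_to_exit / p.freq_hz <= deadline_s]
    if not feasible:
        return max(points, key=lambda p: p.freq_hz)
    return min(feasible, key=lambda p: dynamic_energy(cycles_to_exit, p))


# Hypothetical voltage/frequency table (illustrative values only).
POINTS = [
    OperatingPoint(0.5e9, 0.7),
    OperatingPoint(1.0e9, 0.9),
    OperatingPoint(1.5e9, 1.1),
]

# An input predicted to exit early needs few cycles and can run slowly.
early = pick_operating_point(4e6, 0.010, POINTS)
# An input predicted to run deep into the network needs the fastest setting.
late = pick_operating_point(12e6, 0.010, POINTS)
```

In this sketch, the input predicted to exit early can run at the slowest, lowest-voltage point, while the input predicted to run deep into the network needs the fastest point to meet the same 10 ms deadline.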

Pages: 6
