End-to-End Deep Policy Feedback-Based Reinforcement Learning Method for Quantization in DNNs

Cited by: 1

Authors:

Babu, R. Logesh [1]

Gurumoorthy, Sasikumar [2]

Parameshachari, B. D. [3]

Nelson, S. Christalin [4]

Hua, Qiaozhi [5]

Affiliations:

[1] Madanapalle Inst Technol & Sci, Dept Comp Sci & Engn, Chittoor 517325, Andhra Pradesh, India

[2] Jerusalem Coll Engn, Dept Comp Sci & Engn, Chennai 600100, Tamil Nadu, India

[3] GSSS Inst Engn & Technol Women, Dept Telecommun Engn, Mysuru 570011, Karnataka, India

[4] Univ Petr & Energy Studies UPES, Sch Comp Sci, Dept Syst Cluster, Dehra Dun 248007, Uttarakhand, India

[5] Hubei Univ Arts & Sci, Sch Comp, Xiangyang 441000, Hubei, Peoples R China

Source:

JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS | 2022 / Vol. 31 / No. 13

Keywords:

Constrained embedded systems; deep neural networks; long short-term memory network; policy feedback; proximal policy optimization technique; reinforcement learning method; NEURAL ARCHITECTURE SEARCH; EFFICIENCY; ACCURACY; NETWORKS

DOI:

10.1142/S0218126622502322

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

In resource-constrained embedded systems, designing efficient deep neural networks (DNNs) is challenging because of the diversity of artificial intelligence applications. Quantization substantially reduces the storage and computation time of a DNN by reducing the bit-width of the network encoding. To address the problem of accuracy loss, the quantization levels are discovered automatically using a Policy Feedback-based Reinforcement Learning Method (PF-RELEQ). In this paper, the Proximal Policy Optimization with Policy Feedback (PPO-PF) technique is proposed to determine the best design decisions by choosing the optimum hyper-parameters. To make the value function more sensitive to changes in the policy and to improve the accuracy of value estimation in the early learning stage, a policy update method is devised based on the clipped discount factor. In addition, the policy loss functions satisfy the unbiased estimation of the trust region. The proposed PF-RELEQ effectively balances quality and speed compared with other deep learning methods such as ResNet-1202, ResNet-32, ResNet-110, GoogLeNet and AlexNet. The experimental analysis showed that PF-RELEQ achieved a 20% reduction in computational workload compared with the existing deep learning methods on the ImageNet, CIFAR-10, CIFAR-100 and tomato leaf disease datasets, and an improvement of approximately 2% in validation accuracy. Additionally, PF-RELEQ needs only 0.55 Graphics Processing Unit on an NVIDIA GTX-1080Ti to develop DNNs that deliver better accuracy with fewer cycle counts for image classification.
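
As a rough illustration of the bit-width trade-off the abstract refers to, the sketch below applies plain symmetric uniform quantization to a handful of weights at several fixed bit-widths. It is not the PF-RELEQ method (which, per the abstract, selects quantization levels with a reinforcement learning agent); the function names `quantize_uniform` and `dequantize` and the sample weights are assumptions made only for this example.

```python
def quantize_uniform(weights, bits):
    """Symmetric uniform quantization of floats to signed `bits`-bit integer codes.

    Illustrative helper (hypothetical name), not taken from the paper.
    """
    if bits < 2:
        raise ValueError("bits must be at least 2")
    qmax = 2 ** (bits - 1) - 1                    # largest code, e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round each weight to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


def max_error(weights, bits):
    """Largest absolute reconstruction error at the given bit-width."""
    codes, scale = quantize_uniform(weights, bits)
    restored = dequantize(codes, scale)
    return max(abs(w - r) for w, r in zip(weights, restored))


if __name__ == "__main__":
    sample = [0.52, -1.27, 0.03, 0.88, -0.41]     # made-up weights for the demo
    for bits in (8, 4, 2):
        codes, _ = quantize_uniform(sample, bits)
        print(f"{bits}-bit codes={codes} max_error={max_error(sample, bits):.4f}")
```

Fewer bits mean smaller codes to store but a coarser grid and hence a larger reconstruction error, which is the accuracy loss that an automated bit-width search tries to keep in check.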


Pages: 25

50 items in total

[21] ORACLE: End-to-End Model Based Reinforcement Learning
Andersen, Per-Arne
Goodwin, Morten
Granmo, Ole-Christoffer
ARTIFICIAL INTELLIGENCE XXXVIII, 2021, 13101 : 44 - 57
[22] A Deep Learning-Based End-To-End CT Reconstruction Method
Lu, K.
Ren, L.
Yin, F.
MEDICAL PHYSICS, 2020, 47 (06) : E507 - E508
[23] Off-policy model-based end-to-end safe reinforcement learning
Kanso, Soha
Jha, Mayank Shekhar
Theilliol, Didier
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2024, 34 (04) : 2806 - 2831
[24] Eagle: End-to-end Deep Reinforcement Learning based Autonomous Control of PTZ Cameras
Sandha, Sandeep Singh
Balaji, Bharathan
Garcia, Luis
Srivastava, Mani
PROCEEDINGS 8TH ACM/IEEE CONFERENCE ON INTERNET OF THINGS DESIGN AND IMPLEMENTATION, IOTDI 2023, 2023 : 144 - 157
[25] End-to-End Autonomous Navigation Based on Deep Reinforcement Learning with a Survival Penalty Function
Jeng, Shyr-Long
Chiang, Chienhsun
SENSORS, 2023, 23 (20)
[26] End-to-End Deep Reinforcement Learning for Image-Based UAV Autonomous Control
Zhao, Jiang
Sun, Jiaming
Cai, Zhihao
Wang, Longhong
Wang, Yingxun
APPLIED SCIENCES-BASEL, 2021, 11 (18)
[27] An End-to-End Detection Method for WebShell with Deep Learning
Qi, Longchen
Kong, Rui
Lu, Yang
Zhuang, Honglin
2018 EIGHTH INTERNATIONAL CONFERENCE ON INSTRUMENTATION AND MEASUREMENT, COMPUTER, COMMUNICATION AND CONTROL (IMCCC 2018), 2018 : 660 - 665
[28] End-to-End Learning of Deep Visuomotor Policy for Needle Picking
Lin, Hongbin
Li, Bin
Chu, Xiangyu
Dou, Qi
Liu, Yunhui
Au, Kwok Wai Samuel
2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2023 : 8487 - 8494
[29] Deep Reinforcement Learning for End-to-End Network Slicing: Challenges and Solutions
Liu, Qiang
Choi, Nakjung
Han, Tao
IEEE NETWORK, 2023, 37 (02) : 222 - 228
[30] End-to-end sensorimotor control problems of AUVs with deep reinforcement learning
Wu, Hui
Song, Shiji
Hsu, Yachu
You, Keyou
Wu, Cheng
2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2019 : 5869 - 5874
