End-to-End Deep Policy Feedback-Based Reinforcement Learning Method for Quantization in DNNs

Cited by: 1
Authors
Babu, R. Logesh [1 ]
Gurumoorthy, Sasikumar [2 ]
Parameshachari, B. D. [3 ]
Nelson, S. Christalin [4 ]
Hua, Qiaozhi [5 ]
Affiliations
[1] Madanapalle Inst Technol & Sci, Dept Comp Sci & Engn, Chittoor 517325, Andhra Pradesh, India
[2] Jerusalem Coll Engn, Dept Comp Sci & Engn, Chennai 600100, Tamil Nadu, India
[3] GSSS Inst Engn & Technol Women, Dept Telecommun Engn, Mysuru 570011, Karnataka, India
[4] Univ Petr & Energy Studies UPES, Sch Comp Sci, Dept Syst Cluster, Dehra Dun 248007, Uttarakhand, India
[5] Hubei Univ Arts & Sci, Sch Comp, Xiangyang 441000, Hubei, Peoples R China
Keywords
Constrained embedded systems; deep neural networks; long short-term memory network; policy feedback; proximal policy optimization technique; reinforcement learning method; NEURAL ARCHITECTURE SEARCH; EFFICIENCY; ACCURACY; NETWORKS;
DOI
10.1142/S0218126622502322
Chinese Library Classification (CLC): TP3 [computing technology; computer technology]
Discipline code: 0812
Abstract
Designing efficient deep neural networks for resource-constrained embedded systems is challenging because of the diversity of artificial intelligence applications. Quantization markedly reduces the storage and computation time of deep neural networks by lowering the bit-width used to encode the network. To address the resulting accuracy loss, the quantization levels are discovered automatically with the Policy Feedback-based Reinforcement Learning Method (PF-RELEQ). This paper proposes the Proximal Policy Optimization with Policy Feedback (PPO-PF) technique, which determines the best design decisions by choosing optimal hyper-parameters. To increase the sensitivity of the value function to policy changes, and to improve the accuracy of value estimation in the early learning stage, a policy-update method based on a clipped discount factor is devised; in addition, the policy loss functions satisfy unbiased estimation of the trust region. The proposed PF-RELEQ balances quality and speed more effectively than other deep learning methods such as ResNet-1202, ResNet-110, ResNet-32, GoogLeNet and AlexNet. Experimental analysis showed that PF-RELEQ reduced the computational workload by 20% relative to existing deep learning methods on the ImageNet, CIFAR-10, CIFAR-100 and tomato leaf disease datasets, and improved validation accuracy by approximately 2%. Moreover, PF-RELEQ needs only 0.55 GPU on an NVIDIA GTX-1080Ti to develop DNNs that deliver better accuracy with fewer cycle counts for image classification.
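The abstract combines two standard ingredients: reducing the bit-width of network weights (quantization) and PPO's clipped surrogate objective. The minimal sketch below illustrates both in isolation; it is not the paper's PF-RELEQ algorithm, and the symmetric per-tensor quantizer and the clipping threshold `eps=0.2` are generic textbook choices, not values taken from the paper.

```python
import numpy as np

def quantize_uniform(w, bits):
    """Simulate symmetric per-tensor uniform quantization of w to `bits` bits.

    Lowering `bits` shrinks the encoding (storage/compute) at the cost of
    larger rounding error -- the accuracy/efficiency trade-off the paper's
    RL agent searches over when it picks per-layer quantization levels.
    """
    levels = 2 ** (bits - 1) - 1               # e.g. 127 for 8-bit signed
    scale = np.max(np.abs(w)) / levels         # map max magnitude to top level
    q = np.clip(np.round(w / scale), -levels, levels)
    return q * scale                           # de-quantized ("fake-quant") values

def ppo_clipped_objective(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate: mean(min(r*A, clip(r, 1-eps, 1+eps)*A)).

    `ratio` is pi_new(a|s) / pi_old(a|s); clipping keeps updates near the
    old policy, the trust-region-style behavior the abstract refers to.
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return np.minimum(unclipped, clipped).mean()
```

For example, with `ratio = 1.5` and a positive advantage, the objective is capped at `1 + eps = 1.2` times the advantage, so an overly aggressive policy step gains nothing; and quantizing the same weights at 4 bits yields a larger reconstruction error than at 8 bits.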
Pages: 25
Related Papers (50 records)
  • [1] End-to-end Deep Reinforcement Learning Based Coreference Resolution
    Fei, Hongliang
    Li, Xu
    Li, Dingcheng
    Li, Ping
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 660 - 665
  • [2] End-to-End AUV Local Motion Planning Method Based on Deep Reinforcement Learning
    Lyu, Xi
    Sun, Yushan
    Wang, Lifeng
    Tan, Jiehui
    Zhang, Liwen
    JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2023, 11 (09)
  • [3] Early Failure Detection of Deep End-to-End Control Policy by Reinforcement Learning
    Lee, Keuntaek
    Saigol, Kamil
    Theodorou, Evangelos A.
    2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 8543 - 8549
  • [4] End-to-End Autonomous Driving Decision Based on Deep Reinforcement Learning
    Huang, Zhiqing
    Zhang, Ji
    Tian, Rui
    Zhang, Yanxin
    CONFERENCE PROCEEDINGS OF 2019 5TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND ROBOTICS (ICCAR), 2019, : 658 - 662
  • [5] End-to-End Deep Reinforcement Learning based Recommendation with Supervised Embedding
    Liu, Feng
    Guo, Huifeng
    Li, Xutao
    Tang, Ruiming
    Ye, Yunming
    He, Xiuqiang
    PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING (WSDM '20), 2020, : 384 - 392
  • [6] End-to-End Autonomous Driving Decision Based on Deep Reinforcement Learning
    Huang Z.-Q.
    Qu Z.-W.
    Zhang J.
    Zhang Y.-X.
    Tian R.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2020, 48 (09): : 1711 - 1719
  • [7] NeuroVectorizer: End-to-End Vectorization with Deep Reinforcement Learning
    Haj-Ali, Ameer
    Ahmed, Nesreen K.
    Willke, Ted
    Shao, Yakun Sophia
    Asanovic, Krste
    Stoica, Ion
CGO'20: PROCEEDINGS OF THE 18TH ACM/IEEE INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION, 2020, : 242 - 255
  • [8] End-to-End Deep Reinforcement Learning for Exoskeleton Control
    Rose, Lowell
    Bazzocchi, Michael C. F.
    Nejat, Goldie
    2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020, : 4294 - 4301
  • [9] End-to-End Race Driving with Deep Reinforcement Learning
    Jaritz, Maximilian
    de Charette, Raoul
    Toromanoff, Marin
    Perot, Etienne
    Nashashibi, Fawzi
    2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2018, : 2070 - 2075
  • [10] End-to-End Deep Reinforcement Learning for Conversation Disentanglement
    Bhukar, Karan
    Kumar, Harshit
    Raghu, Dinesh
    Gupta, Ajay
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 11, 2023, : 12571 - 12579