End-to-End Deep Policy Feedback-Based Reinforcement Learning Method for Quantization in DNNs

Cited by: 1
Authors
Babu, R. Logesh [1]
Gurumoorthy, Sasikumar [2]
Parameshachari, B. D. [3]
Nelson, S. Christalin [4]
Hua, Qiaozhi [5]
Affiliations
[1] Madanapalle Inst Technol & Sci, Dept Comp Sci & Engn, Chittoor 517325, Andhra Pradesh, India
[2] Jerusalem Coll Engn, Dept Comp Sci & Engn, Chennai 600100, Tamil Nadu, India
[3] GSSS Inst Engn & Technol Women, Dept Telecommun Engn, Mysuru 570011, Karnataka, India
[4] Univ Petr & Energy Studies UPES, Sch Comp Sci, Dept Syst Cluster, Dehra Dun 248007, Uttarakhand, India
[5] Hubei Univ Arts & Sci, Sch Comp, Xiangyang 441000, Hubei, Peoples R China
Keywords
Constrained embedded systems; deep neural networks; long short-term memory network; policy feedback; proximal policy optimization technique; reinforcement learning method; NEURAL ARCHITECTURE SEARCH; EFFICIENCY; ACCURACY; NETWORKS;
DOI
10.1142/S0218126622502322
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Designing efficient deep neural networks (DNNs) for resource-constrained embedded systems is challenging because of the diversity of artificial intelligence applications. Quantization substantially reduces the storage and computation time of a DNN by reducing the bit-width of the network encoding. To address the resulting accuracy loss, the quantization levels are discovered automatically using a Policy Feedback-based Reinforcement Learning Method (PF-RELEQ). This paper proposes the Proximal Policy Optimization with Policy Feedback (PPO-PF) technique to determine the best design decisions by choosing the optimum hyper-parameters. To make the value function more sensitive to changes in the policy and to improve the accuracy of value estimation in the early learning stage, a policy update method is devised based on the clipped discount factor. In addition, the policy loss functions satisfy the unbiased estimation of the trust region. The proposed PF-RELEQ balances quality and speed effectively compared with other deep learning methods such as ResNet-1202, ResNet-32, ResNet-110, GoogLeNet and AlexNet. The experimental analysis showed that PF-RELEQ achieved a 20% reduction in computational workload compared with existing deep learning methods on the ImageNet, CIFAR-10, CIFAR-100 and tomato leaf disease datasets, and an improvement of approximately 2% in validation accuracy. Additionally, PF-RELEQ needs only 0.55 Graphics Processing Unit on an NVIDIA GTX-1080Ti to develop DNNs that deliver better accuracy with fewer cycle counts for image classification.
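The trade-off the abstract describes, lower bit-width giving smaller storage at the cost of accuracy, can be illustrated with a minimal sketch of symmetric uniform quantization. This is not the paper's PF-RELEQ method: the bit-width here is fixed by hand rather than learned by a reinforcement learning agent, and the function names (`quantize_uniform`, `dequantize`, `storage_bits`) and the sample weights are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: fixed-bit-width symmetric uniform quantization.
# All names and sample values are assumptions, not taken from the paper.

def quantize_uniform(weights, bits):
    """Map floats to signed integer codes in [-qmax, qmax] for the given bit-width."""
    if bits < 2:
        raise ValueError("bits must be at least 2")
    qmax = 2 ** (bits - 1) - 1                 # e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


def storage_bits(num_weights, bits):
    """Bits needed to store the integer codes (ignoring the shared scale)."""
    return num_weights * bits


weights = [0.82, -0.31, 0.05, -0.97, 0.44, 0.0]  # hypothetical layer weights
for bits in (8, 4, 2):
    codes, scale = quantize_uniform(weights, bits)
    restored = dequantize(codes, scale)
    max_err = max(abs(w - r) for w, r in zip(weights, restored))
    print(f"{bits}-bit: storage={storage_bits(len(weights), bits)} bits, "
          f"max abs error={max_err:.4f}")
```

Running the loop shows storage shrinking in proportion to the bit-width while the worst-case reconstruction error grows, which is the accuracy loss that an automatic search over quantization levels tries to keep small.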
Pages: 25