End-to-End Deep Policy Feedback-Based Reinforcement Learning Method for Quantization in DNNs

Cited by: 1
Authors
Babu, R. Logesh [1 ]
Gurumoorthy, Sasikumar [2 ]
Parameshachari, B. D. [3 ]
Nelson, S. Christalin [4 ]
Hua, Qiaozhi [5 ]
Affiliations
[1] Madanapalle Inst Technol & Sci, Dept Comp Sci & Engn, Chittoor 517325, Andhra Pradesh, India
[2] Jerusalem Coll Engn, Dept Comp Sci & Engn, Chennai 600100, Tamil Nadu, India
[3] GSSS Inst Engn & Technol Women, Dept Telecommun Engn, Mysuru 570011, Karnataka, India
[4] Univ Petr & Energy Studies UPES, Sch Comp Sci, Dept Syst Cluster, Dehra Dun 248007, Uttarakhand, India
[5] Hubei Univ Arts & Sci, Sch Comp, Xiangyang 441000, Hubei, Peoples R China
Keywords
Constrained embedded systems; deep neural networks; long short-term memory network; policy feedback; proximal policy optimization technique; reinforcement learning method; NEURAL ARCHITECTURE SEARCH; EFFICIENCY; ACCURACY; NETWORKS;
DOI
10.1142/S0218126622502322
Chinese Library Classification (CLC): TP3 [computing technology; computer technology]
Discipline code: 0812
Abstract
Designing efficient deep neural networks for resource-constrained embedded systems is challenging because of the diversity of artificial intelligence applications. Quantization markedly reduces the storage and computation time of deep neural networks by lowering the bit-width used to encode the network. To address the resulting accuracy loss, the quantization levels are discovered automatically with the Policy Feedback-based Reinforcement Learning Method (PF-RELEQ). This paper proposes the Proximal Policy Optimization with Policy Feedback (PPO-PF) technique, which determines the best design decisions by choosing optimal hyper-parameters. To increase the sensitivity of the value function to policy changes, and to improve the accuracy of value estimation in the early learning stage, a policy-update method based on a clipped discount factor is devised; in addition, the policy loss functions satisfy unbiased estimation of the trust region. The proposed PF-RELEQ balances quality and speed more effectively than other deep learning methods such as ResNet-1202, ResNet-110, ResNet-32, GoogLeNet and AlexNet. Experimental analysis showed that PF-RELEQ reduced the computational workload by 20% relative to existing deep learning methods on the ImageNet, CIFAR-10, CIFAR-100 and tomato leaf disease datasets, and improved validation accuracy by approximately 2%. Moreover, PF-RELEQ needs only 0.55 GPU on an NVIDIA GTX-1080Ti to develop DNNs that deliver better accuracy with fewer cycle counts for image classification.
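The abstract combines two standard ingredients: reducing the bit-width of network weights (quantization) and PPO's clipped surrogate objective. The minimal sketch below illustrates both in isolation; it is not the paper's PF-RELEQ algorithm, and the symmetric per-tensor quantizer and the clipping threshold `eps=0.2` are generic textbook choices, not values taken from the paper.

```python
import numpy as np

def quantize_uniform(w, bits):
    """Simulate symmetric per-tensor uniform quantization of w to `bits` bits.

    Lowering `bits` shrinks the encoding (storage/compute) at the cost of
    larger rounding error -- the accuracy/efficiency trade-off the paper's
    RL agent searches over when it picks per-layer quantization levels.
    """
    levels = 2 ** (bits - 1) - 1               # e.g. 127 for 8-bit signed
    scale = np.max(np.abs(w)) / levels         # map max magnitude to top level
    q = np.clip(np.round(w / scale), -levels, levels)
    return q * scale                           # de-quantized ("fake-quant") values

def ppo_clipped_objective(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate: mean(min(r*A, clip(r, 1-eps, 1+eps)*A)).

    `ratio` is pi_new(a|s) / pi_old(a|s); clipping keeps updates near the
    old policy, the trust-region-style behavior the abstract refers to.
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return np.minimum(unclipped, clipped).mean()
```

For example, with `ratio = 1.5` and a positive advantage, the objective is capped at `1 + eps = 1.2` times the advantage, so an overly aggressive policy step gains nothing; and quantizing the same weights at 4 bits yields a larger reconstruction error than at 8 bits.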
Pages: 25
Related Papers (50 records)
  • [1] End-to-end Deep Reinforcement Learning Based Coreference Resolution
    Fei, Hongliang
    Li, Xu
    Li, Dingcheng
    Li, Ping
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 660 - 665
  • [2] End-to-End AUV Local Motion Planning Method Based on Deep Reinforcement Learning
    Lyu, Xi
    Sun, Yushan
    Wang, Lifeng
    Tan, Jiehui
    Zhang, Liwen
    JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2023, 11 (09)
  • [3] Early Failure Detection of Deep End-to-End Control Policy by Reinforcement Learning
    Lee, Keuntaek
    Saigol, Kamil
    Theodorou, Evangelos A.
    2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 8543 - 8549
  • [4] End-to-End Autonomous Driving Decision Based on Deep Reinforcement Learning
    Huang, Zhiqing
    Zhang, Ji
    Tian, Rui
    Zhang, Yanxin
    CONFERENCE PROCEEDINGS OF 2019 5TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND ROBOTICS (ICCAR), 2019, : 658 - 662
  • [5] End-to-End Deep Reinforcement Learning based Recommendation with Supervised Embedding
    Liu, Feng
    Guo, Huifeng
    Li, Xutao
    Tang, Ruiming
    Ye, Yunming
    He, Xiuqiang
    PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING (WSDM '20), 2020, : 384 - 392
  • [6] End-to-End Autonomous Driving Decision Based on Deep Reinforcement Learning
    Huang Z.-Q.
    Qu Z.-W.
    Zhang J.
    Zhang Y.-X.
    Tian R.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2020, 48 (09): : 1711 - 1719
  • [7] NeuroVectorizer: End-to-End Vectorization with Deep Reinforcement Learning
    Haj-Ali, Ameer
    Ahmed, Nesreen K.
    Willke, Ted
    Shao, Yakun Sophia
    Asanovic, Krste
    Stoica, Ion
CGO'20: PROCEEDINGS OF THE 18TH ACM/IEEE INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION, 2020, : 242 - 255
  • [8] End-to-End Deep Reinforcement Learning for Exoskeleton Control
    Rose, Lowell
    Bazzocchi, Michael C. F.
    Nejat, Goldie
    2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020, : 4294 - 4301
  • [9] End-to-End Race Driving with Deep Reinforcement Learning
    Jaritz, Maximilian
    de Charette, Raoul
    Toromanoff, Marin
    Perot, Etienne
    Nashashibi, Fawzi
    2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2018, : 2070 - 2075
  • [10] End-to-End Deep Reinforcement Learning for Conversation Disentanglement
    Bhukar, Karan
    Kumar, Harshit
    Raghu, Dinesh
    Gupta, Ajay
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 11, 2023, : 12571 - 12579