A proximal policy optimization based deep reinforcement learning framework for tracking control of a flexible robotic manipulator

Cited by: 0
Authors
Kumar, V. Joshi [1]
Elumalai, Vinodh Kumar [1]
Affiliation
[1] Vellore Inst Technol, Sch Elect Engn, Vellore 632014, Tamilnadu, India
Keywords
Deep reinforcement learning; Proximal policy gradient; Policy feedback; Flexible joint manipulator; Vibration suppression;
DOI
10.1016/j.rineng.2025.104178
CLC number
T [Industrial Technology]
Discipline classification code
08
Abstract
This paper puts forward a policy-feedback-based deep reinforcement learning (DRL) control scheme for a partially observable system by leveraging the potential of the proximal policy optimization (PPO) algorithm and a convolutional neural network (CNN). Although several DRL algorithms have been investigated for fully observable systems, there have been limited studies on devising DRL control for a partially observable system with uncertain dynamics. Moreover, the major limitation of the existing policy gradient based DRL techniques is that they are computationally expensive and suffer from scalability issues for complex higher-order systems. Hence, in this study, we adopt the PPO technique, which utilizes first-order optimization to minimize the computational complexity, and devise a DRL scheme for a partially observable flexible link robot manipulator system. Specifically, to improve stability and convergence in the PPO algorithm, this study adopts a collaborative policy approach in the update of the value function and presents a collaborative proximal policy optimization (CPPO) algorithm that can address the tracking control and vibration suppression problems in a partially observable robotic manipulator system. With the optimal hyper-parameters of the DRL identified using the grid search method, we exploit the capability of the CNN in an actor-critic architecture to extract the spatial dependencies in the state sequences of the dynamical system and boost the DRL performance. To improve the convergence of the proposed DRL algorithm, this study adopts a Lyapunov-based reward shaping technique. The experimental validation on a robotic manipulator system through hardware-in-the-loop (HIL) testing substantiates that the proposed framework offers faster convergence and better vibration suppression compared to the state-of-the-art policy gradient and actor-critic techniques.
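For readers unfamiliar with the first-order update that the abstract attributes to PPO, the following is a minimal sketch of the standard clipped surrogate objective. It is not the paper's CPPO algorithm, whose collaborative value-function update is not described in this record. The function name `ppo_clip_objective`, its argument names, and the default `clip_eps=0.2` are illustrative assumptions.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same transitions.
    """
    if not (len(logp_new) == len(logp_old) == len(advantages)):
        raise ValueError("inputs must have equal length")
    if not advantages:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Because the objective only needs probability ratios and advantage estimates, it can be optimized with ordinary gradient ascent, which is the sense in which PPO is a first-order method. When the new and old policies coincide, every ratio equals 1 and the objective reduces to the mean advantage.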
Pages: 15
Related papers
50 in total
  • [41] A Reinforcement-Learning Approach to Control Robotic Manipulator Based on Improved DDPG
    Majumder, Saikat
    Sahoo, Soumya Ranjan
    2023 NINTH INDIAN CONTROL CONFERENCE, ICC, 2023, : 281 - 286
  • [42] Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm
    Jayant, Ashish Kumar
    Bhatnagar, Shalabh
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [43] Automated Lane Change Strategy using Proximal Policy Optimization-based Deep Reinforcement Learning
    Ye, Fei
    Cheng, Xuxin
    Wang, Pin
    Chan, Ching-Yao
    Zhang, Jiucai
    2020 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2020, : 1746 - 1752
  • [44] PPO-ABR: Proximal Policy Optimization based Deep Reinforcement Learning for Adaptive BitRate streaming
    Naresh, Mandan
    Saxena, Paresh
    Gupta, Manik
    2023 INTERNATIONAL WIRELESS COMMUNICATIONS AND MOBILE COMPUTING, IWCMC, 2023, : 199 - 204
  • [45] Path Planning for the Robotic Manipulator in Dynamic Environments Based on a Deep Reinforcement Learning Method
    Jie Liu
    Hwa Jen Yap
    Anis Salwa Mohd Khairuddin
    Journal of Intelligent & Robotic Systems, 111 (1)
  • [46] Model-Based Reinforcement Learning via Proximal Policy Optimization
    Sun, Yuewen
    Yuan, Xin
    Liu, Wenzhang
    Sun, Changyin
    2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019, : 4736 - 4740
  • [47] Learning Variable Impedance Control for Robotic Massage With Deep Reinforcement Learning: A Novel Learning Framework
    Li, Zhuoran
    Zeng, Chao
    Deng, Zhen
    Xu, Qinling
    He, Bingwei
    Zhang, Jianwei
    IEEE SYSTEMS MAN AND CYBERNETICS MAGAZINE, 2024, 10 (01): : 17 - 27
  • [48] Proximal Policy Optimization Through a Deep Reinforcement Learning Framework for Multiple Autonomous Vehicles at a Non-Signalized Intersection
    Duy Quang Tran
    Bae, Sang-Hoon
    APPLIED SCIENCES-BASEL, 2020, 10 (16):
  • [49] Curiosity model policy optimization for robotic manipulator tracking control with input saturation in uncertain environment
    Wang, Tu
    Wang, Fujie
    Xie, Zhongye
    Qin, Feiyan
    FRONTIERS IN NEUROROBOTICS, 2024, 18
  • [50] Neural-Learning-Based Control for a Constrained Robotic Manipulator With Flexible Joints
    He, Wei
    Yan, Zichen
    Sun, Yongkun
    Ou, Yongsheng
    Sun, Changyin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (12) : 5993 - 6003