Safe reinforcement learning-based control using deep deterministic policy gradient algorithm and slime mould algorithm with experimental tower crane system validation

Cited by: 2
Authors
Zamfirache, Iuliu Alexandru [1]
Precup, Radu-Emil [1,2]
Petriu, Emil M. [3]
Affiliations
[1] Politehn Univ Timisoara, Dept Automat & Appl Informat, Bd V Parvan 2, Timisoara 300223, Romania
[2] Romanian Acad, Ctr Fundamental & Adv Tech Res, Timisoara Branch, Bd Mihai Viteazu 24, Timisoara 300223, Romania
[3] Univ Ottawa, Sch Elect Engn & Comp Sci, 800 King Edward, Ottawa, ON K1N 6N5, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Deep deterministic policy gradient; Optimal reference tracking control; Safe reinforcement learning; Slime mould algorithm; Tower crane systems
DOI
10.1016/j.ins.2024.121640
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
This paper presents a novel optimal control approach that combines a safe Reinforcement Learning (RL) framework, represented by a Deep Deterministic Policy Gradient (DDPG) algorithm, with a Slime Mould Algorithm (SMA) as a representative nature-inspired optimization algorithm. The traditional DDPG-based safe RL optimal control approach has two main drawbacks: the possible instability of the control system caused by randomly generated initial values of the controller parameters, and the lack of state safety guarantees in the first iterations of the learning process. The latter arises because (i) the safety constraints are considered only in the DDPG-based training process of the controller, which is usually implemented as a neural network (NN), and (ii) the weights and biases of the NN-based controller are initialized with randomly generated values. The proposed approach mitigates these drawbacks by initializing the parameters of the NN-based controller using SMA. The fitness function of the SMA-based initialization process is designed to incorporate state safety constraints into the search process, resulting in an initial NN-based controller with embedded state safety constraints. The proposed approach is compared with the classical one using real-time experimental results and performance indices, both those popular for optimal reference tracking control problems and ones based on a state safety score.
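The idea of embedding state safety constraints in the fitness function used to initialize an NN-based controller can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the first-order plant, the network size, the bound `x_max`, the penalty weight, and all function names. In particular, a simple elitist random search stands in for SMA; it is not an implementation of the slime mould algorithm.

```python
import numpy as np

N_HIDDEN = 4
# Flattened parameters: W1 (4x2), b1 (4), W2 (4), b2 (1). Sizes are hypothetical.
N_PARAMS = 2 * N_HIDDEN + N_HIDDEN + N_HIDDEN + 1


def unpack(theta):
    """Split a flat parameter vector into the weights and biases of a tiny NN."""
    i = 0
    W1 = theta[i:i + 2 * N_HIDDEN].reshape(N_HIDDEN, 2)
    i += 2 * N_HIDDEN
    b1 = theta[i:i + N_HIDDEN]
    i += N_HIDDEN
    W2 = theta[i:i + N_HIDDEN]
    i += N_HIDDEN
    b2 = theta[i]
    return W1, b1, W2, b2


def controller(theta, error, state):
    """NN-based controller: maps (tracking error, state) to a control signal."""
    W1, b1, W2, b2 = unpack(theta)
    hidden = np.tanh(W1 @ np.array([error, state]) + b1)
    return float(W2 @ hidden + b2)


def fitness(theta, reference=1.0, x_max=1.2, steps=50, penalty=100.0):
    """Tracking cost plus a penalty for every step that violates |x| <= x_max."""
    x = 0.0
    cost = 0.0
    for _ in range(steps):
        u = controller(theta, reference - x, x)
        x = 0.9 * x + 0.1 * u          # hypothetical first-order plant
        cost += (reference - x) ** 2   # reference tracking term
        if abs(x) > x_max:             # state safety constraint violated
            cost += penalty
    return cost


def search_initial_parameters(n_agents=20, n_iters=30, seed=0):
    """Elitist random search (a stand-in for SMA) over controller parameters."""
    rng = np.random.default_rng(seed)
    population = rng.normal(0.0, 1.0, size=(n_agents, N_PARAMS))
    scores = np.array([fitness(p) for p in population])
    best = population[np.argmin(scores)].copy()
    best_score = scores.min()
    for it in range(n_iters):
        step = 0.5 * (1.0 - it / n_iters)  # shrink the search radius over time
        candidates = best + step * rng.normal(size=(n_agents, N_PARAMS))
        cand_scores = np.array([fitness(c) for c in candidates])
        j = np.argmin(cand_scores)
        if cand_scores[j] < best_score:
            best, best_score = candidates[j].copy(), cand_scores[j]
    return best, best_score


if __name__ == "__main__":
    theta0, score0 = search_initial_parameters()
    print("fitness of the selected initial parameters:", score0)
```

In such a scheme, the vector returned by `search_initial_parameters` would replace the random initial weights and biases of the controller network before gradient-based training starts, so that safety is already accounted for in the first learning iterations.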
Pages: 18