Safe reinforcement learning-based control using deep deterministic policy gradient algorithm and slime mould algorithm with experimental tower crane system validation

Cited by: 2
Authors
Zamfirache, Iuliu Alexandru [1]
Precup, Radu-Emil [1,2]
Petriu, Emil M. [3]
Affiliations
[1] Politehn Univ Timisoara, Dept Automat & Appl Informat, Bd V Parvan 2, Timisoara 300223, Romania
[2] Romanian Acad, Ctr Fundamental & Adv Tech Res, Timisoara Branch, Bd Mihai Viteazu 24, Timisoara 300223, Romania
[3] Univ Ottawa, Sch Elect Engn & Comp Sci, 800 King Edward, Ottawa, ON K1N 6N5, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Deep deterministic policy gradient; Optimal reference tracking control; Safe reinforcement learning; Slime mould algorithm; Tower crane systems;
DOI
10.1016/j.ins.2024.121640
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
This paper presents a novel optimal control approach that combines a safe Reinforcement Learning (RL) framework, represented by a Deep Deterministic Policy Gradient (DDPG) algorithm, with a Slime Mould Algorithm (SMA) as a representative nature-inspired optimization algorithm. The main drawbacks of the traditional DDPG-based safe RL optimal control approach are the possible instability of the control system caused by randomly generated initial values of the controller parameters and the lack of state safety guarantees in the first iterations of the learning process. The latter arises for two reasons: (i) the safety constraints are considered only in the DDPG-based training process of the controller, which is usually implemented as a neural network (NN); (ii) the weights and the biases of the NN-based controller are initialized with randomly generated values. The proposed approach mitigates these drawbacks by initializing the parameters of the NN-based controller using SMA. The fitness function of the SMA-based initialization process is designed to incorporate state safety constraints into the search process, resulting in an initial NN-based controller with embedded state safety constraints. The proposed approach is compared to the classical one using real-time experimental results and two kinds of performance indices: those popular for optimal reference tracking control problems and those based on a state safety score.
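The idea of folding state safety constraints into the fitness function used to initialize an NN-based controller can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy second-order plant, the 2-4-1 network layout, the penalty weight, the state limit `x_limit`, and the function names. The search loop is a plain elitist random search standing in for the SMA, whose update rules are not given in the abstract.

```python
import numpy as np


def controller(params, x):
    """Hypothetical tiny NN controller: 2 inputs -> 4 tanh units -> 1 bounded action."""
    W1 = params[:8].reshape(4, 2)
    b1 = params[8:12]
    W2 = params[12:16].reshape(1, 4)
    b2 = params[16:17]
    return np.tanh(W2 @ np.tanh(W1 @ x + b1) + b2)[0]


def rollout_fitness(params, x_limit=1.0, steps=100, dt=0.05,
                    reference=0.5, penalty=100.0):
    """Tracking cost plus a penalty whenever the position leaves [-x_limit, x_limit].

    The plant is an assumed damped double integrator, not the tower crane model.
    """
    x = np.zeros(2)  # [position, velocity]
    cost = 0.0
    for _ in range(steps):
        u = controller(params, np.array([reference - x[0], x[1]]))
        x = x + dt * np.array([x[1], u - 0.5 * x[1]])
        cost += (reference - x[0]) ** 2                       # tracking error term
        cost += penalty * max(0.0, abs(x[0]) - x_limit)        # state safety term
    return cost / steps


def initialize_by_search(n_params=17, pop=20, iters=30, seed=0):
    """Elitist random search (a stand-in for SMA) over the controller parameters."""
    rng = np.random.default_rng(seed)
    population = rng.uniform(-1.0, 1.0, size=(pop, n_params))
    fitness = np.array([rollout_fitness(p) for p in population])
    best = population[np.argmin(fitness)].copy()
    best_fit = float(fitness.min())
    for it in range(iters):
        scale = 0.5 * (1.0 - it / iters)  # shrink the search radius over time
        candidates = best + scale * rng.normal(size=(pop, n_params))
        fit = np.array([rollout_fitness(c) for c in candidates])
        i = int(np.argmin(fit))
        if fit[i] < best_fit:
            best, best_fit = candidates[i].copy(), float(fit[i])
    return best, best_fit
```

Calling `initialize_by_search()` returns a parameter vector that could seed the actor network before gradient-based training starts, instead of a purely random draw. Because the penalty term makes unsafe rollouts expensive, the search is biased toward initial controllers that keep the state inside the assumed limit.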
Pages: 18