Safe reinforcement learning-based control using deep deterministic policy gradient algorithm and slime mould algorithm with experimental tower crane system validation

Cited by: 2
Authors
Zamfirache, Iuliu Alexandru [1]
Precup, Radu-Emil [1, 2]
Petriu, Emil M. [3]
Affiliations
[1] Politehn Univ Timisoara, Dept Automat & Appl Informat, Bd V Parvan 2, Timisoara 300223, Romania
[2] Romanian Acad, Ctr Fundamental & Adv Tech Res, Timisoara Branch, Bd Mihai Viteazu 24, Timisoara 300223, Romania
[3] Univ Ottawa, Sch Elect Engn & Comp Sci, 800 King Edward, Ottawa, ON K1N 6N5, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Deep deterministic policy gradient; Optimal reference tracking control; Safe reinforcement learning; Slime mould algorithm; Tower crane systems;
DOI
10.1016/j.ins.2024.121640
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper presents a novel optimal control approach that combines the safe Reinforcement Learning (RL) framework, represented by a Deep Deterministic Policy Gradient (DDPG) algorithm, with a Slime Mould Algorithm (SMA) as a representative nature-inspired optimization algorithm. The main drawbacks of the traditional DDPG-based safe RL optimal control approach are the possible instability of the control system caused by randomly generated initial values of the controller parameters and the lack of state safety guarantees in the first iterations of the learning process. These drawbacks have two causes: (i) the safety constraints are considered only in the DDPG-based training process of the controller, which is usually implemented as a neural network (NN); (ii) the weights and the biases of the NN-based controller are initialized with randomly generated values. The proposed approach mitigates these drawbacks by initializing the parameters of the NN-based controller using SMA. The fitness function of the SMA-based initialization process is designed to incorporate state safety constraints into the search process, resulting in an initial NN-based controller with embedded state safety constraints. The proposed approach is compared to the classical one using real-time experimental results and performance indices popular for optimal reference tracking control problems and based on a state safety score.
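The idea of a safety-aware initialization search described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the first-order toy plant (not the tower crane model), the one-neuron controller, the bound `X_MAX`, the penalty weight, and the function names. A simple population-based random search stands in for SMA; the actual SMA update rules are not reproduced here. The sketch only shows how a fitness function can add a penalty for state constraint violations to a tracking cost, so that the parameters handed to later DDPG training start from a controller that respects the bound.

```python
import math
import random

# Hypothetical toy plant (NOT the tower crane model): x[k+1] = A*x[k] + B*u[k]
A, B = 0.9, 0.1
X_MAX = 1.2   # assumed state safety bound: |x| <= X_MAX
REF = 1.0     # constant reference to track
STEPS = 50    # rollout length


def simulate(params):
    """Roll out a one-neuron controller u = gain * tanh(w*e + bias).

    Returns the summed squared tracking error and the number of
    time steps at which the state bound was violated.
    """
    w, bias, gain = params
    x, cost, violations = 0.0, 0.0, 0
    for _ in range(STEPS):
        e = REF - x
        u = gain * math.tanh(w * e + bias)
        x = A * x + B * u
        cost += e * e
        if abs(x) > X_MAX:
            violations += 1
    return cost, violations


def fitness(params, penalty=1e3):
    """Tracking cost plus a large penalty per unsafe step (assumed form)."""
    cost, violations = simulate(params)
    return cost + penalty * violations


def search_initial_params(pop_size=20, iters=30, seed=0):
    """Stand-in population search; this is not the SMA update law."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(pop_size)]
    pop[0] = [0.0, 0.0, 0.0]  # zero-action candidate, trivially safe here
    best = min(pop, key=fitness)
    for it in range(iters):
        step = 1.0 - it / iters  # shrink the exploration radius over time
        new_pop = []
        for cand in pop:
            # Perturb around the best candidate, scaled by distance to cand.
            trial = [
                b + step * rng.gauss(0.0, 1.0) * (b - c) + step * rng.gauss(0.0, 0.5)
                for b, c in zip(best, cand)
            ]
            # Greedy selection: keep the trial only if it improves fitness.
            new_pop.append(trial if fitness(trial) < fitness(cand) else cand)
        pop = new_pop
        best = min(pop + [best], key=fitness)
    return best


if __name__ == "__main__":
    initial_params = search_initial_params()
    print("initial parameters:", initial_params)
    print("cost, violations:", simulate(initial_params))
```

In this sketch the penalty weight is chosen so that any candidate with at least one unsafe step scores worse than the zero-action candidate, which is why the returned parameters never violate the assumed bound on this toy plant. In a real setting the returned vector would be loaded into the actor network before DDPG training begins.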
Pages: 18
Related papers
50 items in total
  • [21] Deep Deterministic Policy Gradient Algorithm based Lateral and Longitudinal Control for Autonomous Driving
    Zhu Gongsheng
    Pei Chunmei
    Ding Jiang
    Shi Junfeng
    2020 5TH INTERNATIONAL CONFERENCE ON MECHANICAL, CONTROL AND COMPUTER ENGINEERING (ICMCCE 2020), 2020, : 736 - 741
  • [22] A Novel Deep Deterministic Policy Gradient Assisted Learning-Based Control Algorithm for Three-Phase DC/AC Inverter With an RL Load
    Xiang, Chaoqun
    Zhang, Xinan
    Qie, Tianhao
    Chau, Tat Kei
    Ye, Jian
    Yu, Yang
    Iu, Herbert Ho Ching
    Fernando, Tyrone
    IEEE JOURNAL OF EMERGING AND SELECTED TOPICS IN POWER ELECTRONICS, 2023, 11 (06) : 5529 - 5539
  • [23] A theoretical demonstration for reinforcement learning of PI control dynamics for optimal speed control of DC motors by using Twin Delay Deep Deterministic Policy Gradient Algorithm
    Tufenkci, Sevilay
    Alagoz, Baris Baykant
    Kavuran, Gurkan
    Yeroglu, Celaleddin
    Herencsar, Norbert
    Mahata, Shibendu
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 213
  • [24] Continuous Control for Automated Lane Change Behavior Based on Deep Deterministic Policy Gradient Algorithm
    Wang, Pin
    Li, Hanhan
    Chan, Ching-Yao
    2019 30TH IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV19), 2019, : 1454 - 1460
  • [25] Course Tracking Control for Smart Ships Based on A Deep Deterministic Policy Gradient-based Algorithm
    Wang, Wei-ye
    Ma, Feng
    Liu, Jialun
    2019 5TH INTERNATIONAL CONFERENCE ON TRANSPORTATION INFORMATION AND SAFETY (ICTIS 2019), 2019, : 1400 - 1404
  • [26] Robot Confrontation Based On Genetic Fuzzy System Guided Deep Deterministic Policy Gradient Algorithm
    Gu, Mingyang
    Guo, Xian
    Zhang, Xuebo
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020, : 538 - 544
  • [27] Deep deterministic policy gradient reinforcement learning based temperature control of a fermentation bioreactor for ethanol production
    Rajasekhar, N.
    Radhakrishnan, T. K.
    Samsudeen, N.
    JOURNAL OF THE INDIAN CHEMICAL SOCIETY, 2025, 102 (02)
  • [28] Intelligent ship anti-rolling control system based on a deep deterministic policy gradient algorithm and the Magnus effect
    Lin, Jianfeng
    Han, Yang
    Guo, Chunyu
    Su, Yumin
    Zhong, Ruofan
    PHYSICS OF FLUIDS, 2022, 34 (05)
  • [29] Agent-Based Modeling in Electricity Market Using Deep Deterministic Policy Gradient Algorithm
    Liang, Yanchang
    Guo, Chunlin
    Ding, Zhaohao
    Hua, Huichun
    IEEE TRANSACTIONS ON POWER SYSTEMS, 2020, 35 (06) : 4180 - 4192
  • [30] Agent-Based Energy Sharing Mechanism Using Deep Deterministic Policy Gradient Algorithm
    Kuang, Yi
    Wang, Xiuli
    Zhao, Hongyang
    Huang, Yijun
    Chen, Xianlong
    Wang, Xifan
    ENERGIES, 2020, 13 (19)