Reinforcement learning-based optimal control for Markov jump systems with completely unknown dynamics

Cited by: 6
Authors
Shi, Xiongtao [1, 2]
Li, Yanjie [1, 2]
Du, Chenglong [3]
Chen, Chaoyang [4]
Zong, Guangdeng [5]
Gui, Weihua [3]
Affiliations
[1] Harbin Inst Technol Shenzhen, Guangdong Key Lab Intelligent Morphing Mech & Adap, Shenzhen 518055, Peoples R China
[2] Harbin Inst Technol Shenzhen, Sch Mech Engn & Automat, Shenzhen 518055, Peoples R China
[3] Cent South Univ, Sch Automat, Changsha 410083, Peoples R China
[4] Hunan Univ Sci & Technol, Sch Informat & Elect Engn, Xiangtan 411201, Peoples R China
[5] Tiangong Univ, Sch Control Sci & Engn, Tianjin 300387, Peoples R China
Keywords
Markov jump systems; Optimal control; Coupled algebraic Riccati equation; Parallel policy iteration; Reinforcement learning; ADAPTIVE OPTIMAL-CONTROL; TRACKING CONTROL; LINEAR-SYSTEMS
DOI
10.1016/j.automatica.2024.111886
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper investigates the optimal control problem for a class of unknown Markov jump systems (MJSs) using parallel policy iteration-based reinforcement learning (PPI-RL) algorithms. First, for MJSs with known dynamics, a model-based PPI-RL algorithm is studied that learns the solution of the nonlinear coupled algebraic Riccati equation (CARE) by solving the linear parallel Lyapunov equation, thereby updating the optimal control gain. Then, a novel partially model-free PPI-RL algorithm is proposed for the case in which the dynamics of the MJS are partially unknown; here the optimal solution of the CARE is learned from the mixed input-output data of all modes. Furthermore, for MJSs with completely unknown dynamics, a completely model-free PPI-RL algorithm is developed that obtains the optimal control gain by removing the dependence on model information when solving for the optimal solution of the CARE. It is proved that the proposed PPI-RL algorithms converge to the unique optimal solution of the CARE for MJSs with known, partially unknown, and completely unknown dynamics, respectively. Finally, simulation results are presented to show the feasibility and effectiveness of the PPI-RL algorithms.
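To make the model-based step concrete, the following is a minimal sketch of a decoupled ("parallel") Lyapunov iteration for a scalar two-mode MJS with known dynamics. It is an illustration only and is not taken from the paper: it assumes the standard continuous-time coupled Riccati equation 2 a_i p_i + q_i - (b_i p_i)^2 / r_i + sum_j pi_ij p_j = 0, uses the classical idea of freezing the other modes' value estimates so that each mode's Lyapunov equation can be solved independently, and all names and numbers (A, B, Q, R, PI, parallel_policy_iteration, care_residual) are hypothetical. The scalar case is chosen so that only the Python standard library is needed.

```python
# Illustrative scalar two-mode Markov jump system. All values are assumptions.
A = [-1.0, -2.0]                 # open-loop dynamics a_i (both modes stable)
B = [1.0, 1.0]                   # input gains b_i
Q = [1.0, 1.0]                   # state weights q_i
R = [1.0, 1.0]                   # input weights r_i
PI = [[-1.0, 1.0], [2.0, -2.0]]  # transition rate matrix, rows sum to zero


def parallel_policy_iteration(a, b, q, r, pi, iters=100):
    """Decoupled Lyapunov iterations for the scalar coupled Riccati equation."""
    n = len(a)
    # p_i = 0 gives the zero gain, which is admissible here only because
    # every mode in this example is open-loop stable.
    p = [0.0] * n
    for _ in range(iters):
        # Policy improvement: k_i = b_i * p_i / r_i
        k = [b[i] * p[i] / r[i] for i in range(n)]
        p_next = []
        for i in range(n):
            # Closed-loop dynamics shifted by half the diagonal rate pi_ii
            a_cl = a[i] - b[i] * k[i] + 0.5 * pi[i][i]
            if a_cl >= 0.0:
                raise ValueError("gain for mode %d is not stabilizing" % i)
            # Coupling term uses the previous iterate of the other modes,
            # so each mode's equation can be solved independently.
            coupling = sum(pi[i][j] * p[j] for j in range(n) if j != i)
            # Scalar Lyapunov equation:
            # 2 * a_cl * p_i + q_i + r_i * k_i**2 + coupling = 0
            p_next.append(-(q[i] + r[i] * k[i] ** 2 + coupling) / (2.0 * a_cl))
        p = p_next
    k = [b[i] * p[i] / r[i] for i in range(n)]
    return p, k


def care_residual(a, b, q, r, pi, p):
    """Left-hand side of the scalar coupled Riccati equation for each mode."""
    n = len(a)
    return [
        2.0 * a[i] * p[i] + q[i] - (b[i] * p[i]) ** 2 / r[i]
        + sum(pi[i][j] * p[j] for j in range(n))
        for i in range(n)
    ]


if __name__ == "__main__":
    p_opt, k_opt = parallel_policy_iteration(A, B, Q, R, PI)
    print("p:", p_opt)
    print("k:", k_opt)
    print("residual:", care_residual(A, B, Q, R, PI, p_opt))
```

Each pass evaluates the current gains mode by mode and then improves them, so the nonlinear coupled equation is replaced by a sequence of linear ones. The partially and completely model-free variants described in the abstract replace the model terms with measured data, which this sketch does not attempt.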
Pages: 8
Related papers
50 in total
  • [31] Optimal control for continuous-time Markov jump singularly perturbed systems: A hybrid reinforcement learning scheme
    Huang, Yaling
    Li, Wenqian
    Wang, Yun
    Shen, Hao
    JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2024, 361 (07):
  • [32] Reinforcement learning-based robust optimal tracking control for disturbed nonlinear systems
    Zhong-Xin Fan
    Lintao Tang
    Shihua Li
    Rongjie Liu
    Neural Computing and Applications, 2023, 35 : 23987 - 23996
  • [33] Stability and Fuzzy Optimal Control for Nonlinear Ito Stochastic Markov Jump Systems via Hybrid Reinforcement Learning
    Pang, Zhen
    Wang, Hai
    Cheng, Jun
    Tang, Shengda
    Park, Ju H.
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2024, 32 (11) : 6472 - 6485
  • [34] Reinforcement Learning-Based Adaptive Optimal Control for Nonlinear Systems With Asymmetric Hysteresis
    Zheng, Licheng
    Liu, Zhi
    Wang, Yaonan
    Chen, C. L. Philip
    Zhang, Yun
    Wu, Zongze
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (11) : 15800 - 15809
  • [35] Reinforcement learning-based robust optimal tracking control for disturbed nonlinear systems
    Fan, Zhong-Xin
    Tang, Lintao
    Li, Shihua
    Liu, Rongjie
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (33) : 23987 - 23996
  • [36] Adaptive Optimal Consensus Control of Multiagent Systems With Unknown Dynamics and Disturbances via Reinforcement Learning
    Chen L.
    Dong C.
    Dai S.-L.
    IEEE Transactions on Artificial Intelligence, 2024, 5 (05) : 2193 - 2203
  • [37] Adaptive Optimal Control of Nonlinear Active Suspension Systems with Completely Unknown Dynamics
    Chen, Xin
    Huang, Yingbo
    Na, Jing
    Gao, Guanbin
    Zhao, Jun
    PROCEEDINGS OF THE 33RD CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2021), 2021 : 3524 - 3529
  • [38] Distributed Tracking Control of Completely Unknown Heterogeneous MASs Based on Reinforcement Learning
    Wang, Zhipeng
    Huo, Shicheng
    2024 14TH ASIAN CONTROL CONFERENCE, ASCC 2024, 2024 : 587 - 592
  • [39] A learning-based approach to event-triggered guaranteed cost control for completely unknown nonlinear systems
    Liang, Yuling
    Zhang, Jun
    Zhao, Hui
    Su, Hanguang
    Cui, Xiaohong
    TRANSACTIONS OF THE INSTITUTE OF MEASUREMENT AND CONTROL, 2024, 46 (06) : 1203 - 1218
  • [40] Reinforcement Learning-Based Control for Nonlinear Discrete-Time Systems with Unknown Control Directions and Control Constraints
    Huang, Miao
    Liu, Cong
    He, Xiaoqi
    Ma, Longhua
    Lu, Zheming
    Su, Hongye
    NEUROCOMPUTING, 2020, 402 : 50 - 65