Reinforcement learning-based optimal control for Markov jump systems with completely unknown dynamics

Cited by: 6
Authors
Shi, Xiongtao [1 ,2 ]
Li, Yanjie [1 ,2 ]
Du, Chenglong [3 ]
Chen, Chaoyang [4 ]
Zong, Guangdeng [5 ]
Gui, Weihua [3 ]
Affiliations
[1] Harbin Inst Technol Shenzhen, Guangdong Key Lab Intelligent Morphing Mech & Adap, Shenzhen 518055, Peoples R China
[2] Harbin Inst Technol Shenzhen, Sch Mech Engn & Automat, Shenzhen 518055, Peoples R China
[3] Cent South Univ, Sch Automat, Changsha 410083, Peoples R China
[4] Hunan Univ Sci & Technol, Sch Informat & Elect Engn, Xiangtan 411201, Peoples R China
[5] Tiangong Univ, Sch Control Sci & Engn, Tianjin 300387, Peoples R China
Keywords
Markov jump systems; Optimal control; Coupled algebraic Riccati equation; Parallel policy iteration; Reinforcement learning; ADAPTIVE OPTIMAL-CONTROL; TRACKING CONTROL; LINEAR-SYSTEMS;
DOI
10.1016/j.automatica.2024.111886
Chinese Library Classification (CLC): TP [Automation technology; computer technology]
Discipline code: 0812
Abstract
In this paper, the optimal control problem for a class of unknown Markov jump systems (MJSs) is investigated via parallel policy iteration-based reinforcement learning (PPI-RL) algorithms. First, by solving a linear parallel Lyapunov equation, a model-based PPI-RL algorithm is studied to learn the solution of the nonlinear coupled algebraic Riccati equation (CARE) of MJSs with known dynamics, thereby updating the optimal control gain. Then, a novel partially model-free PPI-RL algorithm is proposed for the scenario in which the dynamics of the MJS are partially unknown, where the optimal solution of the CARE is learned from the mixed input-output data of all modes. Furthermore, for MJSs with completely unknown dynamics, a completely model-free PPI-RL algorithm is developed to obtain the optimal control gain by removing the dependence on model information when solving for the optimal solution of the CARE. It is proved that the proposed PPI-RL algorithms converge to the unique optimal solution of the CARE for MJSs with known, partially unknown, and completely unknown dynamics, respectively. Finally, simulation results are presented to illustrate the feasibility and effectiveness of the PPI-RL algorithms.
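To make the model-based step in the abstract concrete: learning the CARE solution by repeatedly solving linear parallel Lyapunov equations and then updating the control gains is the classic policy-iteration pattern for Markov jump linear systems. The sketch below is a minimal, illustrative rendering of such a model-based parallel policy iteration, not the paper's implementation; the per-mode matrices A_i, B_i, Q_i, R_i, the transition-rate matrix Pi, and the function name parallel_policy_iteration are assumptions introduced here. The paper's partially and completely model-free variants replace the Lyapunov-equation step with relations built from measured input-output data, which this model-based sketch does not cover.

```python
# Illustrative model-based parallel policy iteration for the continuous-time
# Markov jump LQR problem (assumed setting; notation not taken from the paper).
import numpy as np
from scipy.linalg import solve_continuous_lyapunov


def parallel_policy_iteration(A, B, Q, R, Pi, K0, iters=100, tol=1e-9):
    """Approximate the coupled algebraic Riccati equations (CARE) of an MJLS.

    A, B, Q, R : lists of per-mode matrices; Pi : transition-rate matrix
    (rows sum to zero); K0 : initial stabilizing gains. Returns (P, K).
    """
    N = len(A)
    K = [k.copy() for k in K0]
    P = [np.zeros_like(Qi) for Qi in Q]
    for _ in range(iters):
        P_prev = [p.copy() for p in P]
        # Policy evaluation: one Lyapunov equation per mode. The diagonal rate
        # pi_ii is absorbed into the closed-loop matrix, and the cross-mode
        # coupling sum_{j != i} pi_ij * P_j is frozen at the previous iterate,
        # so the N equations decouple and can be solved in parallel.
        for i in range(N):
            Acl = A[i] - B[i] @ K[i] + 0.5 * Pi[i, i] * np.eye(A[i].shape[0])
            coupling = sum(Pi[i, j] * P_prev[j] for j in range(N) if j != i)
            rhs = Q[i] + K[i].T @ R[i] @ K[i] + coupling
            # solve_continuous_lyapunov(a, q) solves a X + X a^H = q.
            P[i] = solve_continuous_lyapunov(Acl.T, -rhs)
        # Policy improvement: K_i <- R_i^{-1} B_i^T P_i.
        K = [np.linalg.solve(R[i], B[i].T @ P[i]) for i in range(N)]
        if max(np.linalg.norm(P[i] - P_prev[i]) for i in range(N)) < tol:
            break
    return P, K


if __name__ == "__main__":
    # Toy two-mode example with made-up data; both modes are open-loop stable,
    # so the zero initial gains are admissible.
    A = [np.array([[0.0, 1.0], [-2.0, -1.0]]), np.array([[0.0, 1.0], [-1.0, -2.0]])]
    B = [np.array([[0.0], [1.0]]), np.array([[0.0], [2.0]])]
    Q = [np.eye(2), 2.0 * np.eye(2)]
    R = [np.eye(1), np.eye(1)]
    Pi = np.array([[-0.6, 0.6], [0.4, -0.4]])
    K0 = [np.zeros((1, 2)), np.zeros((1, 2))]
    P, K = parallel_policy_iteration(A, B, Q, R, Pi, K0)
    print("mode-1 gain:", K[0], "mode-2 gain:", K[1])
```

Freezing the cross-mode coupling at the previous iterate is what lets the per-mode Lyapunov equations be solved independently, which is one natural reading of the "parallel" in parallel policy iteration; the data-driven variants described in the abstract would instead estimate the same quantities from trajectories collected across all modes.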
Pages: 8
Related papers (50 in total)
  • [1] Reinforcement learning-based adaptive optimal tracking algorithm for Markov jump systems with partial unknown dynamics
    Tu, Yidong
    Fang, Haiyang
    Wang, Hai
    Shi, Kaibo
    He, Shuping
    OPTIMAL CONTROL APPLICATIONS & METHODS, 2022, 43 (05) : 1435 - 1449
  • [2] Reinforcement learning-based composite suboptimal control for Markov jump singularly perturbed systems with unknown dynamics
    Li, Wenqian
    Jia, Guolong
    Wang, Yun
    Su, Lei
    Shen, Hao
    MATHEMATICAL METHODS IN THE APPLIED SCIENCES, 2024, 47 (14) : 11551 - 11564
  • [3] Reinforcement Learning-Based Robust Tracking Control for Unknown Markov Jump Systems and its Application
    Shen, Hao
    Wu, Jiacheng
    Wang, Yun
    Wang, Jing
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2024, 71 (03) : 1211 - 1215
  • [4] H∞ optimal output tracking control for Markov jump systems: A reinforcement learning-based approach
    Shen, Ying
    Yao, Cai-Kang
    Chen, Bo
    Che, Wei-Wei
    Wu, Zheng-Guang
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2024, 34 (08) : 5149 - 5167
  • [5] Reinforcement learning-based linear quadratic tracking control for partially unknown Markov jump singular interconnected systems
    Jia, Guolong
    Yang, Qing
    Liu, Jinxu
    Shen, Hao
    APPLIED MATHEMATICS AND COMPUTATION, 2025, 491
  • [6] Optimal tracking control for completely unknown nonlinear discrete-time Markov jump systems using data-based reinforcement learning method
    Jiang, He
    Zhang, Huaguang
    Luo, Yanhong
    Wang, Junyi
    NEUROCOMPUTING, 2016, 194 : 176 - 182
  • [7] Off-policy reinforcement learning for tracking control of discrete-time Markov jump linear systems with completely unknown dynamics
    Huang Z.
    Tu Y.
    Fang H.
    Wang H.
    Zhang L.
    Shi K.
    He S.
    Journal of the Franklin Institute, 2023, 360 (03) : 2361 - 2378
  • [8] Reinforcement Learning-Based Adaptive Optimal Exponential Tracking Control of Linear Systems With Unknown Dynamics
    Chen, Ci
    Modares, Hamidreza
    Xie, Kan
    Lewis, Frank L.
    Wan, Yan
    Xie, Shengli
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2019, 64 (11) : 4423 - 4438
  • [9] Reinforcement learning and adaptive optimization of a class of Markov jump systems with completely unknown dynamic information
    He, Shuping
    Zhang, Maoguang
    Fang, Haiyang
    Liu, Fei
    Luan, Xiaoli
    Ding, Zhengtao
    NEURAL COMPUTING & APPLICATIONS, 2020, 32 (18) : 14311 - 14320