Reinforcement learning-based optimal control for Markov jump systems with completely unknown dynamics

Cited by: 6
Authors
Shi, Xiongtao [1, 2]
Li, Yanjie [1, 2]
Du, Chenglong [3 ]
Chen, Chaoyang [4 ]
Zong, Guangdeng [5 ]
Gui, Weihua [3 ]
Affiliations
[1] Harbin Inst Technol Shenzhen, Guangdong Key Lab Intelligent Morphing Mech & Adap, Shenzhen 518055, Peoples R China
[2] Harbin Inst Technol Shenzhen, Sch Mech Engn & Automat, Shenzhen 518055, Peoples R China
[3] Cent South Univ, Sch Automat, Changsha 410083, Peoples R China
[4] Hunan Univ Sci & Technol, Sch Informat & Elect Engn, Xiangtan 411201, Peoples R China
[5] Tiangong Univ, Sch Control Sci & Engn, Tianjin 300387, Peoples R China
Keywords
Markov jump systems; Optimal control; Coupled algebraic Riccati equation; Parallel policy iteration; Reinforcement learning; ADAPTIVE OPTIMAL-CONTROL; TRACKING CONTROL; LINEAR-SYSTEMS;
DOI
10.1016/j.automatica.2024.111886
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper investigates the optimal control problem for a class of unknown Markov jump systems (MJSs) via parallel policy iteration-based reinforcement learning (PPI-RL) algorithms. First, for MJSs with known dynamics, a model-based PPI-RL algorithm is studied that learns the solution of the nonlinear coupled algebraic Riccati equation (CARE) by solving the linear parallel Lyapunov equation, and thereby updates the optimal control gain. Then, a novel partially model-free PPI-RL algorithm is proposed for the case in which the dynamics of the MJS are partially unknown; here the optimal solution of the CARE is learned from the mixed input-output data of all modes. Furthermore, for MJSs with completely unknown dynamics, a completely model-free PPI-RL algorithm is developed that obtains the optimal control gain by removing the dependence on model information when solving for the optimal solution of the CARE. The proposed PPI-RL algorithms are proved to converge to the unique optimal solution of the CARE for MJSs with known, partially unknown, and completely unknown dynamics, respectively. Finally, simulation results are presented to show the feasibility and effectiveness of the PPI-RL algorithms.
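The model-based step described in the abstract (solve decoupled linear Lyapunov equations, then update the gains) can be illustrated with a minimal sketch. Everything concrete here is an assumption, not taken from the record: a continuous-time linear MJS dx/dt = A_i x + B_i u with transition-rate matrix Pi, a quadratic cost with weights Q_i and R_i, the CARE in its usual form A_i'P_i + P_i A_i + Q_i - P_i B_i R_i^{-1} B_i' P_i + sum_j pi_ij P_j = 0, and the hypothetical function names `lyap`, `care_residual`, and `parallel_policy_iteration`. The iteration shown is a standard Lyapunov-type scheme and may differ in detail from the paper's PPI-RL algorithm; the toy two-mode scalar system is invented for illustration.

```python
import numpy as np


def lyap(Acl, M):
    """Solve Acl' P + P Acl + M = 0 for P via Kronecker vectorization."""
    n = Acl.shape[0]
    I = np.eye(n)
    L = np.kron(I, Acl.T) + np.kron(Acl.T, I)
    return np.linalg.solve(L, -M.reshape(-1)).reshape(n, n)


def care_residual(A, B, Q, R, Pi, P):
    """Largest absolute entry of the CARE residual over all modes."""
    N = len(A)
    worst = 0.0
    for i in range(N):
        coupling = sum(Pi[i, j] * P[j] for j in range(N))
        res = (A[i].T @ P[i] + P[i] @ A[i] + Q[i] + coupling
               - P[i] @ B[i] @ np.linalg.solve(R[i], B[i].T @ P[i]))
        worst = max(worst, float(np.abs(res).max()))
    return worst


def parallel_policy_iteration(A, B, Q, R, Pi, K0, iters=200, tol=1e-12):
    """Assumed scheme: one decoupled Lyapunov solve per mode, per iteration.

    K0 must make A_i + 0.5*pi_ii*I - B_i K0_i stable for every mode.
    """
    N, n = len(A), A[0].shape[0]
    K = [k.copy() for k in K0]
    P = [np.zeros((n, n)) for _ in range(N)]
    for _ in range(iters):
        P_new = []
        for i in range(N):
            # Fold the diagonal rate pi_ii into the drift so that only the
            # other modes' previous P_j appear as a constant coupling term.
            Acl = A[i] + 0.5 * Pi[i, i] * np.eye(n) - B[i] @ K[i]
            M = Q[i] + K[i].T @ R[i] @ K[i]
            M = M + sum(Pi[i, j] * P[j] for j in range(N) if j != i)
            P_new.append(lyap(Acl, M))
        # Policy improvement: K_i = R_i^{-1} B_i' P_i
        K = [np.linalg.solve(R[i], B[i].T @ P_new[i]) for i in range(N)]
        change = max(float(np.abs(P_new[i] - P[i]).max()) for i in range(N))
        P = P_new
        if change < tol:
            break
    return P, K


# Toy two-mode scalar system (illustrative values only).
A = [np.array([[-1.0]]), np.array([[-2.0]])]
B = [np.array([[1.0]]), np.array([[1.0]])]
Q = [np.array([[1.0]]), np.array([[1.0]])]
R = [np.array([[1.0]]), np.array([[1.0]])]
Pi = np.array([[-1.0, 1.0], [1.0, -1.0]])
K0 = [np.zeros((1, 1)), np.zeros((1, 1))]  # stabilizing, since both A_i are stable

P, K = parallel_policy_iteration(A, B, Q, R, Pi, K0)
print("CARE residual:", care_residual(A, B, Q, R, Pi, P))
```

Because each mode's Lyapunov equation uses only the previous iterate of the other modes, the per-mode solves are independent and could run in parallel. The model-free variants in the abstract would replace the explicit use of A_i and B_i with quantities estimated from data, which this sketch does not attempt.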
Pages: 8
Related papers
50 in total
  • [41] Learning-Based Iterative Optimal Control for Unknown Systems Using Gaussian Process Regression
    Hashimoto, Wataru
    Hashimoto, Kazumune
    Onoue, Yuga
    Takai, Shigemasa
    2022 EUROPEAN CONTROL CONFERENCE (ECC), 2022, : 1554 - 1559
  • [42] Imitation-Based Reinforcement Learning for Markov Jump Systems and Its Application
    Wu, Jiacheng
    Wang, Jing
    Shen, Hao
    Park, Ju H.
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2024, 71 (08) : 3810 - 3819
  • [43] A reinforcement learning-based scheme for direct adaptive optimal control of linear stochastic systems
    Wong, Wee Chin
    Lee, Jay H.
    OPTIMAL CONTROL APPLICATIONS & METHODS, 2010, 31 (04) : 365 - 374
  • [44] A General Framework for Learning-Based Distributionally Robust MPC of Markov Jump Systems
    Schuurmans, Mathijs
    Patrinos, Panagiotis
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (05) : 2950 - 2965
  • [45] A Fuzzy-Model-Based Approach to Optimal Control for Nonlinear Markov Jump Singularly Perturbed Systems: A Novel Integral Reinforcement Learning Scheme
    Shen, Hao
    Wang, Yun
    Wang, Jing
    Park, Ju H.
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2023, 31 (10) : 3734 - 3740
  • [46] Optimal Event-Triggered H∞ Control for Nonlinear Systems with Completely Unknown Dynamics
    Chu, Kun
    Peng, Zhinan
    Zhang, Zhiquan
    Huang, Rui
    Shi, Kecheng
    Cheng, Hong
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 2236 - 2241
  • [47] Policy Iteration for Optimal Control of Weakly Coupled Nonlinear Systems with Completely Unknown Dynamics
    Li, Chao
    Wang, Ding
    Liu, Derong
    He, Haibo
    2016 AMERICAN CONTROL CONFERENCE (ACC), 2016, : 5722 - 5727
  • [48] A homotopy-based reinforcement learning scheme to optimal control for Markov switched interconnected systems
    Liu, Jinxu
    Mi, Xuanrui
    Xia, Jianwei
    Su, Lei
    Shen, Hao
    JOURNAL OF CONTROL AND DECISION, 2024,
  • [49] Constrained Reinforcement Learning-Based Closed-Loop Reference Model for Optimal Tracking Control of Unknown Continuous-Time Systems
    Zhang, Haoran
    Zhao, Chunhui
    Ding, Jinliang
    IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2024, 21 (04) : 7312 - 7324
  • [50] Nonfragile Output Feedback Tracking Control for Markov Jump Fuzzy Systems Based on Integral Reinforcement Learning Scheme
    Wang, Jing
    Wu, Jiacheng
    Cao, Jinde
    Chadli, Mohammed
    Shen, Hao
    IEEE TRANSACTIONS ON CYBERNETICS, 2023, 53 (07) : 4521 - 4530