Policy Iteration Q-Learning for Linear Ito Stochastic Systems With Markovian Jumps and Its Application to Power Systems

Cited by: 2
Authors
Ming, Zhongyang [1]
Zhang, Huaguang [1]
Wang, Yingchun [1]
Dai, Jing [2]
Affiliations
[1] Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110004, Liaoning, Peoples R China
[2] Tsinghua Univ, Energy Internet Innovat Res Inst, Beijing 100085, Peoples R China
Keywords
Markovian jump system; neural networks (NNs); Q-learning; stochastic system; ADAPTIVE OPTIMAL-CONTROL; CONTINUOUS-TIME SYSTEMS; NONLINEAR-SYSTEMS
DOI
10.1109/TCYB.2024.3403680
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This article addresses the solution of continuous-time linear Ito stochastic systems with Markovian jumps using an online policy iteration (PI) approach grounded in Q-learning. Initially, a model-dependent offline algorithm, structured according to traditional optimal control strategies, is designed to solve the algebraic Riccati equation (ARE). Employing Lyapunov theory, we rigorously derive the convergence of the offline PI algorithm and the admissibility of the iterative control law through mathematical analysis. This article represents the first attempt to tackle these technical challenges. Subsequently, to address the limitations inherent in the offline algorithm, we introduce a novel online Q-learning algorithm tailored for Ito stochastic systems with Markovian jumps. The proposed Q-learning algorithm obviates the need for transition probabilities and system matrices. We provide a thorough stability analysis of the closed-loop system. Finally, the effectiveness and applicability of the proposed algorithms are demonstrated through a simulation example, underpinned by the theorems established herein.
Pages: 1-10
Page count: 10