Policy Iteration Q-Learning for Linear Ito Stochastic Systems With Markovian Jumps and Its Application to Power Systems

Cited by: 2
Authors
Ming, Zhongyang [1 ]
Zhang, Huaguang [1 ]
Wang, Yingchun [1 ]
Dai, Jing [2 ]
Affiliations
[1] Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110004, Liaoning, Peoples R China
[2] Tsinghua Univ, Energy Internet Innovat Res Inst, Beijing 100085, Peoples R China
Keywords
Markovian jump system; neural networks (NNs); Q-learning; stochastic system; ADAPTIVE OPTIMAL-CONTROL; CONTINUOUS-TIME SYSTEMS; NONLINEAR-SYSTEMS;
DOI
10.1109/TCYB.2024.3403680
CLC number
TP [Automation & Computer Technology];
Discipline code
0812;
Abstract
This article addresses the optimal control of continuous-time linear Ito stochastic systems with Markovian jumps using an online policy iteration (PI) approach grounded in Q-learning. First, a model-dependent offline algorithm, structured according to traditional optimal control strategies, is designed to solve the algebraic Riccati equation (ARE). Employing Lyapunov theory, we rigorously establish the convergence of the offline PI algorithm and the admissibility of the iterative control law. This article represents the first attempt to tackle these technical challenges. Subsequently, to overcome the model dependence of the offline algorithm, we introduce a novel online Q-learning algorithm tailored for Ito stochastic systems with Markovian jumps. The proposed Q-learning algorithm obviates the need for transition probabilities and system matrices. We provide a thorough stability analysis of the closed-loop system. Finally, the effectiveness and applicability of the proposed algorithms are demonstrated through a simulation example, underpinned by the theorems established herein.
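The offline PI step described in the abstract generalizes the classical Kleinman policy iteration for LQR: alternately evaluate the current gain via a Lyapunov equation and improve it from the resulting cost matrix. A minimal sketch of that deterministic LQR analogue is given below; it ignores the Ito diffusion and Markovian-jump terms of the paper's setting, and the matrices A, B, Q, R and the initial gain K0 are hypothetical, chosen only so that K0 = 0 is admissible (A is Hurwitz).

```python
# Minimal sketch of offline policy iteration (Kleinman's algorithm) for the
# deterministic LQR analogue; the paper's setting additionally involves an
# Ito diffusion term and Markovian jumps, which are omitted here.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

def policy_iteration_lqr(A, B, Q, R, K0, iters=30):
    """Alternate policy evaluation (Lyapunov equation) and policy improvement."""
    K = K0
    for _ in range(iters):
        Acl = A - B @ K  # closed-loop matrix under the current gain
        # Policy evaluation: solve Acl' P + P Acl = -(Q + K' R K)
        P = solve_continuous_lyapunov(Acl.T, -(Q + K.T @ R @ K))
        # Policy improvement: K <- R^{-1} B' P
        K = np.linalg.solve(R, B.T @ P)
    return P, K

# Illustrative second-order system; A has a double eigenvalue at -1 (Hurwitz),
# so the zero gain is an admissible initial policy.
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.zeros((1, 2))

P, K = policy_iteration_lqr(A, B, Q, R, K0)
P_are = solve_continuous_are(A, B, Q, R)  # direct ARE solution for comparison
```

Each iteration is a Newton step on the ARE, so `P` converges quadratically to the direct ARE solution `P_are`; the online Q-learning algorithm of the paper replaces the model-based Lyapunov solve with data-driven estimates.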
Pages: 1-10 (10 pages)
Related Papers
50 records in total
  • [41] Predictive control of systems with Markovian jumps under constraints and its application to the investment portfolio optimization
    V. V. Dombrovskii
    T. Yu. Ob"edko
    Automation and Remote Control, 2011, 72 : 989 - 1003
  • [42] A Deep Q-Learning Bisection Approach for Power Allocation in Downlink NOMA Systems
    Youssef, Marie-Josepha
    Nour, Charbel Abdel
    Lagrange, Xavier
    Douillard, Catherine
    IEEE COMMUNICATIONS LETTERS, 2022, 26 (02) : 316 - 320
  • [43] Stabilizing value iteration Q-learning for online evolving control of discrete-time nonlinear systems
    Zhao, Mingming
    Wang, Ding
    Qiao, Junfei
    NONLINEAR DYNAMICS, 2024, 112 (11) : 9137 - 9153
  • [44] Viability for Ito stochastic systems with non-Lipschitzian coefficients and its application
    Shi, Xuejun
    Feng, Qun
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2024, 53 (01) : 446 - 456
  • [45] Off-policy Q-learning: Optimal tracking control for networked control systems
    Li J.-N.
    Yin Z.-X.
    Kongzhi yu Juece/Control and Decision, 2019, 34 (11) : 2343 - 2349
  • [46] Stochastic linear quadratic optimal control for continuous-time systems based on policy iteration
    College of Information Science and Engineering, Northeastern University, Shenyang 110004, China
    [institution unknown], 110034, China
    Kongzhi yu Juece/Control and Decision, (9) : 1674 - 1678
  • [47] Stochastic linear quadratic optimal control for model-free discrete-time systems based on Q-learning algorithm
    Wang, Tao
    Zhang, Huaguang
    Luo, Yanhong
    NEUROCOMPUTING, 2018, 312 : 1 - 8
  • [48] Spectral characterisation for stability and stabilisation of linear stochastic systems with Markovian switching and its applications
    Sheng, Li
    Gao, Ming
    Zhang, Weihai
    IET CONTROL THEORY AND APPLICATIONS, 2013, 7 (05): : 730 - 737
  • [49] Cooperative Q-Learning for Rejection of Persistent Adversarial Inputs in Networked Linear Quadratic Systems
    Vamvoudakis, Kyriakos G.
    Hespanha, Joao P.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2018, 63 (04) : 1018 - 1031
  • [50] Robust Inverse Q-Learning for Continuous-Time Linear Systems in Adversarial Environments
    Lian, Bosen
    Xue, Wenqian
    Lewis, Frank L.
    Chai, Tianyou
    IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (12) : 13083 - 13095