Policy Iteration Q-Learning for Linear Ito Stochastic Systems With Markovian Jumps and Its Application to Power Systems

Cited by: 2
Authors
Ming, Zhongyang [1 ]
Zhang, Huaguang [1 ]
Wang, Yingchun [1 ]
Dai, Jing [2 ]
Affiliations
[1] Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110004, Liaoning, Peoples R China
[2] Tsinghua Univ, Energy Internet Innovat Res Inst, Beijing 100085, Peoples R China
Keywords
Markovian jump system; neural networks (NNs); Q-learning; stochastic system; ADAPTIVE OPTIMAL-CONTROL; CONTINUOUS-TIME SYSTEMS; NONLINEAR-SYSTEMS;
DOI
10.1109/TCYB.2024.3403680
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
This article addresses the solution of continuous-time linear Ito stochastic systems with Markovian jumps using an online policy iteration (PI) approach grounded in Q-learning. Initially, a model-dependent offline algorithm, structured according to traditional optimal control strategies, is designed to solve the algebraic Riccati equation (ARE). Employing Lyapunov theory, we rigorously derive the convergence of the offline PI algorithm and the admissibility of the iterative control law through mathematical analysis. This article represents the first attempt to tackle these technical challenges. Subsequently, to address the limitations inherent in the offline algorithm, we introduce a novel online Q-learning algorithm tailored for Ito stochastic systems with Markovian jumps. The proposed Q-learning algorithm obviates the need for transition probabilities and system matrices. We provide a thorough stability analysis of the closed-loop system. Finally, the effectiveness and applicability of the proposed algorithms are demonstrated through a simulation example, underpinned by the theorems established herein.
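The model-dependent offline PI step mentioned in the abstract can be illustrated with a generic Kleinman-style sketch. This is not the authors' algorithm: the record does not give the system equations, so the model form dx = (A_i x + B_i u) dt + (C_i x + D_i u) dw, the quadratic cost weights Q_i and R_i, the generator matrix `Gen`, and all numerical values below are assumptions chosen for illustration. Each iteration solves the coupled Lyapunov equations for the current gains (policy evaluation) and then updates the gains (policy improvement).

```python
import numpy as np

def evaluate_policy(A, B, C, D, Q, R, Gen, K):
    """Solve the coupled Lyapunov equations for fixed gains K[i] (u = -K[i] x)."""
    N, n = len(A), A[0].shape[0]
    I, s = np.eye(n), n * n
    M = np.zeros((N * s, N * s))
    rhs = np.zeros(N * s)
    for i in range(N):
        Ac = A[i] - B[i] @ K[i]          # closed-loop drift matrix
        Cc = C[i] - D[i] @ K[i]          # closed-loop diffusion matrix
        # Row-major vec identity: vec(X P Y) = kron(X, Y.T) vec(P)
        Li = np.kron(Ac.T, I) + np.kron(I, Ac.T) + np.kron(Cc.T, Cc.T)
        for j in range(N):
            blk = Gen[i, j] * np.eye(s)  # mode-coupling term Gen[i, j] * P[j]
            if i == j:
                blk = blk + Li
            M[i * s:(i + 1) * s, j * s:(j + 1) * s] = blk
        rhs[i * s:(i + 1) * s] = -(Q[i] + K[i].T @ R[i] @ K[i]).reshape(-1)
    p = np.linalg.solve(M, rhs)
    return [p[i * s:(i + 1) * s].reshape(n, n) for i in range(N)]

def improve_policy(B, C, D, R, P):
    """Policy improvement: K_i = (R_i + D_i' P_i D_i)^{-1} (B_i' P_i + D_i' P_i C_i)."""
    return [np.linalg.solve(R[i] + D[i].T @ P[i] @ D[i],
                            B[i].T @ P[i] + D[i].T @ P[i] @ C[i])
            for i in range(len(B))]

def are_residual(A, B, C, D, Q, R, Gen, P):
    """Largest absolute entry of the coupled ARE residual over all modes."""
    worst = 0.0
    for i in range(len(A)):
        S = B[i].T @ P[i] + D[i].T @ P[i] @ C[i]
        G = R[i] + D[i].T @ P[i] @ D[i]
        coupling = sum(Gen[i, j] * P[j] for j in range(len(A)))
        res = (A[i].T @ P[i] + P[i] @ A[i] + C[i].T @ P[i] @ C[i]
               + Q[i] + coupling - S.T @ np.linalg.solve(G, S))
        worst = max(worst, float(np.abs(res).max()))
    return worst

def policy_iteration(A, B, C, D, Q, R, Gen, K0, iters=20):
    """Alternate evaluation and improvement, starting from admissible gains K0."""
    K = K0
    for _ in range(iters):
        P = evaluate_policy(A, B, C, D, Q, R, Gen, K)
        K = improve_policy(B, C, D, R, P)
    return P, K

# Hypothetical two-mode example (all numbers are illustrative assumptions).
A = [np.array([[-1.0, 0.5], [0.0, -2.0]]), np.array([[-1.5, 0.0], [0.3, -1.0]])]
B = [np.array([[0.0], [1.0]]), np.array([[1.0], [0.5]])]
C = [0.1 * np.eye(2), 0.2 * np.eye(2)]
D = [np.array([[0.0], [0.1]]), np.array([[0.1], [0.0]])]
Q = [np.eye(2), np.eye(2)]
R = [np.eye(1), np.eye(1)]
Gen = np.array([[-0.5, 0.5], [0.3, -0.3]])  # transition-rate matrix, rows sum to zero
K0 = [np.zeros((1, 2)), np.zeros((1, 2))]   # zero gain, assumed admissible for this example

P, K = policy_iteration(A, B, C, D, Q, R, Gen, K0)
residual = are_residual(A, B, C, D, Q, R, Gen, P)
print("ARE residual:", residual)
```

The sketch needs the system matrices and the transition rates, which is the model dependence that the online Q-learning algorithm described in the abstract is meant to remove.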
Pages: 1 / 10
Page count: 10
Related papers
50 in total
  • [31] Improved Q-Learning Method for Linear Discrete-Time Systems
    Chen, Jian
    Wang, Jinhua
    Huang, Jie
    PROCESSES, 2020, 8 (03)
  • [32] Minimax Q-learning control for linear systems using the Wasserstein metric
    Zhao, Feiran
    You, Keyou
    AUTOMATICA, 2023, 149
  • [33] Adaptive Optimal Control via Q-Learning for Ito Fuzzy Stochastic Nonlinear Continuous-Time Systems With Stackelberg Game
    Ming, Zhongyang
    Zhang, Huaguang
    Yan, Ying
    Yang, Liu
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2024, 32 (04) : 2029 - 2038
  • [34] Stochastic Linear Quadratic Optimal Control for Discrete-time Systems with Inequality Constraint and Markovian Jumps
    Wang, Wenying
    Zhang, Zhiming
    Miao, Running
    PROCEEDINGS OF THE 2015 4TH INTERNATIONAL CONFERENCE ON COMPUTER, MECHATRONICS, CONTROL AND ELECTRONIC ENGINEERING (ICCMCEE 2015), 2015, 37 : 1535 - 1543
  • [35] Reinforcement Q-Learning for PDF Tracking Control of Stochastic Systems with Unknown Dynamics
    Yang, Weiqing
    Zhou, Yuyang
    Zhang, Yong
    Ren, Yan
    MATHEMATICS, 2024, 12 (16)
  • [36] A Hybrid Multiagent Framework With Q-Learning for Power Grid Systems Restoration
    Ye, Dayong
    Zhang, Minjie
    Sutanto, Danny
    IEEE TRANSACTIONS ON POWER SYSTEMS, 2011, 26 (04) : 2434 - 2441
  • [37] Seeking Nash Equilibrium for Linear Discrete-time Systems via Off-policy Q-learning
    Ni, Haohan
    Ji, Yuxiang
    Yang, Yuxiao
    Zhou, Jianping
    IAENG International Journal of Applied Mathematics, 2024, 54 (11) : 2477 - 2483
  • [38] Output Feedback Reinforcement Q-learning for Optimal Quadratic Tracking Control of Unknown Discrete-Time Linear Systems and Its Application
    Zhao, Guangyue
    Sun, Weijie
    Cai, He
    Peng, Yunjian
    2018 15TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION (ICARCV), 2018, : 750 - 755
  • [39] Stochastic linear quadratic optimal tracking control for discrete-time systems with delays based on Q-learning algorithm
    Tan, Xufeng
    Li, Yuan
    Liu, Yang
    AIMS MATHEMATICS, 2023, 8 (05) : 10249 - 10265
  • [40] Predictive control of systems with Markovian jumps under constraints and its application to the investment portfolio optimization
    Dombrovskii, V. V.
    Ob''edko, T. Yu
    AUTOMATION AND REMOTE CONTROL, 2011, 72 (05) : 989 - 1003