Zero-sum game-based optimal control for discrete-time Markov jump systems: A parallel off-policy Q-learning method

Cited: 1
Authors
Wang, Yun [1 ]
Fang, Tian [1 ]
Kong, Qingkai [1 ]
Li, Feng [1 ]
Affiliations
[1] Anhui Univ Technol, Anhui Prov Key Lab Power Elect & Mot Control, Maanshan 243002, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Markov jump systems; Optimal control; Q-learning method; Game-coupled algebraic Riccati equation; H-INFINITY CONTROL; LINEAR-SYSTEMS; CONTROL DESIGN; NETWORK;
DOI
10.1016/j.amc.2023.128462
Chinese Library Classification
O29 [Applied Mathematics];
Subject Classification Code
070104;
Abstract
In this paper, the zero-sum game problem for linear discrete-time Markov jump systems is solved by two novel model-free reinforcement Q-learning algorithms: on-policy Q-learning and off-policy Q-learning. First, the game-coupled algebraic Riccati equation is derived under the zero-sum game framework. On this basis, a subsystem transformation technique is employed to decouple the jumping modes. Then, a model-free on-policy Q-learning algorithm is introduced within the zero-sum game architecture to obtain the optimal control gain from measured system data. However, probing noise introduces bias into the on-policy algorithm. Thus, an off-policy Q-learning algorithm is proposed to eliminate the effect of probing noise. Subsequently, the convergence of the proposed methods is discussed. Finally, an inverted pendulum system is employed to verify the validity of the proposed methods.
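For context, the game-coupled algebraic Riccati equation mentioned in the abstract typically takes the following coupled form for a discrete-time Markov jump linear system. The notation below (mode-dependent matrices A_i, B_i, D_i, weights Q_i, R_i, attenuation level γ, and transition probabilities π_ij) is a standard-form sketch assumed for illustration and is not taken from the paper.

% Minimal sketch of a standard game-coupled algebraic Riccati equation (GCARE)
% for the zero-sum game of a discrete-time Markov jump linear system
%   x_{k+1} = A_i x_k + B_i u_k + D_i w_k,  i = r_k in {1,...,N},
% with stage payoff x_k' Q_i x_k + u_k' R_i u_k - gamma^2 w_k' w_k.
% All symbols are illustrative assumptions, not the paper's notation.
\[
  \bar{P}_i = \sum_{j=1}^{N} \pi_{ij} P_j , \qquad
  P_i = Q_i + A_i^\top \bar{P}_i A_i
  - \begin{bmatrix} A_i^\top \bar{P}_i B_i & A_i^\top \bar{P}_i D_i \end{bmatrix}
    \begin{bmatrix}
      R_i + B_i^\top \bar{P}_i B_i & B_i^\top \bar{P}_i D_i \\
      D_i^\top \bar{P}_i B_i & D_i^\top \bar{P}_i D_i - \gamma^2 I
    \end{bmatrix}^{-1}
    \begin{bmatrix} B_i^\top \bar{P}_i A_i \\ D_i^\top \bar{P}_i A_i \end{bmatrix} .
\]

In the model-free setting, Q-learning replaces the model-dependent blocks of this equation by a quadratic Q-function kernel identified from measured state and input data, which is what allows the control and disturbance gains to be computed without knowledge of the system matrices.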
Pages: 12
Related Papers
50 records in total
  • [21] Reinforcement Q-learning algorithm for H∞ tracking control of discrete-time Markov jump systems
    Shi, Jiahui
    He, Dakuo
    Zhang, Qiang
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2025, 56 (03) : 502 - 523
  • [22] Off-policy Q-learning-based Tracking Control for Stochastic Linear Discrete-Time Systems
    Liu, Xuantong
    Zhang, Lei
    Peng, Yunjian
    2022 4TH INTERNATIONAL CONFERENCE ON CONTROL AND ROBOTICS, ICCR, 2022, : 252 - 256
  • [23] Optimal Tracking of Nonlinear Discrete-time Systems using Zero-Sum Game Formulation and Hybrid Learning
    Farzanegan, Behzad
    Jagannathan, S.
    2023 AMERICAN CONTROL CONFERENCE, ACC, 2023, : 2715 - 2720
  • [24] Decentralized Zero-sum Games for Nonlinear Systems Based on Off-policy Learning Scheme
    Luo, Hao
    Mu, Chaoxu
    Yu, Lifu
    Wang, Ke
    2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021, : 2155 - 2160
  • [25] Neural Q-learning for discrete-time nonlinear zero-sum games with adjustable convergence rate
    Wang, Yuan
    Wang, Ding
    Zhao, Mingming
    Liu, Nan
    Qiao, Junfei
    NEURAL NETWORKS, 2024, 175
  • [26] Fuzzy-Based Adaptive Optimization of Unknown Discrete-Time Nonlinear Markov Jump Systems With Off-Policy Reinforcement Learning
    Fang, Haiyang
    Tu, Yidong
    Wang, Hai
    He, Shuping
    Liu, Fei
    Ding, Zhengtao
    Cheng, Shing Shin
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2022, 30 (12) : 5276 - 5290
  • [27] Finite-horizon Q-learning for discrete-time zero-sum games with application to H∞ control
    Liu, Mingxiang
    Cai, Qianqian
    Meng, Wei
    Li, Dandan
    Fu, Minyue
    ASIAN JOURNAL OF CONTROL, 2023, 25 (04) : 3160 - 3168
  • [28] Output feedback Q-learning for discrete-time linear zero-sum games with application to the H-infinity control
    Rizvi, Syed Ali Asad
    Lin, Zongli
    AUTOMATICA, 2018, 95 : 213 - 221
  • [29] Output feedback Q-learning for discrete-time finite-horizon zero-sum games with application to the H∞ control
    Liu, Mingxiang
    Cai, Qianqian
    Li, Dandan
    Meng, Wei
    Fu, Minyue
    NEUROCOMPUTING, 2023, 529 : 48 - 55
  • [30] Zero-sum game-based security control of unknown nonlinear Markov jump systems under false data injection attacks
    Gao, Xiaobin
    Deng, Feiqi
    Zeng, Pengyu
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2022