Zero-sum game-based optimal control for discrete-time Markov jump systems: A parallel off-policy Q-learning method

Cited by: 1
Authors
Wang, Yun [1]
Fang, Tian [1]
Kong, Qingkai [1]
Li, Feng [1]
Affiliations
[1] Anhui Univ Technol, Anhui Prov Key Lab Power Elect & Mot Control, Maanshan 243002, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Markov jump systems; Optimal control; Q-learning method; Game-coupled algebraic Riccati equation; H-INFINITY CONTROL; LINEAR-SYSTEMS; CONTROL DESIGN; NETWORK;
DOI
10.1016/j.amc.2023.128462
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
This paper solves the zero-sum game problem for linear discrete-time Markov jump systems with two novel model-free reinforcement Q-learning algorithms: on-policy Q-learning and off-policy Q-learning. First, the game-coupled algebraic Riccati equation is derived within the zero-sum game framework. On this basis, a subsystem transformation technique is employed to decouple the jumping modes. Then, a model-free on-policy Q-learning algorithm is introduced in the zero-sum game architecture to obtain the optimal control gain from measured system data. However, probing noise introduces bias into the on-policy algorithm. An off-policy Q-learning algorithm is therefore proposed to eliminate the effect of the probing noise. Convergence of the proposed methods is then discussed. Finally, an inverted pendulum system is used to verify the validity of the proposed methods.
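To make the "game-coupled" structure in the abstract concrete, the following is a minimal model-based sketch, not the paper's model-free Q-learning algorithms. It assumes a scalar two-mode Markov jump system x[k+1] = a_i x[k] + b_i u[k] + e_i w[k] with stage cost q x^2 + r u^2 - gamma^2 w^2, and runs value iteration on the resulting coupled Riccati recursion. All coefficients, the transition matrix `PI`, and the function names are hypothetical illustration choices; the record above does not give the paper's notation or example data.

```python
# Model-based sketch (NOT the paper's model-free method): value iteration on a
# game-coupled Riccati recursion for a scalar two-mode Markov jump system.
# Every number below is an assumed illustration value.

A = [0.9, 0.8]                 # per-mode state coefficients a_i (assumed)
B = [1.0, 1.0]                 # per-mode control coefficients b_i (assumed)
E = [0.5, 0.5]                 # per-mode disturbance coefficients e_i (assumed)
Q, R, GAMMA = 1.0, 1.0, 5.0    # stage cost q*x^2 + r*u^2 - gamma^2*w^2
PI = [[0.7, 0.3], [0.4, 0.6]]  # mode transition probabilities, rows sum to 1


def kernel(i, p_bar):
    """Quadratic Q-function kernel of mode i for the coupled value p_bar."""
    a, b, e = A[i], B[i], E[i]
    return {
        "xx": Q + a * a * p_bar, "xu": a * b * p_bar, "xw": a * e * p_bar,
        "uu": R + b * b * p_bar, "uw": b * e * p_bar,
        "ww": e * e * p_bar - GAMMA ** 2,
    }


def saddle_gains(h):
    """Solve the stationarity conditions for u = k*x and w = l*x (Cramer's rule)."""
    det = h["uu"] * h["ww"] - h["uw"] ** 2
    k = (-h["xu"] * h["ww"] + h["uw"] * h["xw"]) / det
    l = (-h["uu"] * h["xw"] + h["uw"] * h["xu"]) / det
    return k, l


def iterate(P):
    """One sweep of the coupled recursion over both modes."""
    new_P = []
    for i in range(2):
        # Coupling term: expected next-step value given the current mode i.
        p_bar = sum(PI[i][j] * P[j] for j in range(2))
        h = kernel(i, p_bar)
        k, l = saddle_gains(h)
        # Value of the quadratic form at the saddle point.
        new_P.append(h["xx"] + h["xu"] * k + h["xw"] * l)
    return new_P


P = [0.0, 0.0]
for _ in range(200):
    P = iterate(P)

# Saddle-point gains of each mode at the (numerically) converged values.
gains = [
    saddle_gains(kernel(i, sum(PI[i][j] * P[j] for j in range(2))))
    for i in range(2)
]
```

In this sketch the modes are coupled only through `p_bar`, the transition-probability-weighted sum of the per-mode values. A model-free method such as the ones the abstract describes would estimate the kernel entries from measured data instead of computing them from `A`, `B`, and `E`.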
Pages: 12
Related papers
50 items in total
  • [41] Off-policy Reinforcement Learning for Robust Control of Discrete-time Uncertain Linear Systems
    Yang, Yongliang
    Guo, Zhishan
    Wunsch, Donald
    Yin, Yixin
    PROCEEDINGS OF THE 36TH CHINESE CONTROL CONFERENCE (CCC 2017), 2017, : 2507 - 2512
  • [42] Actor-Critic Off-Policy Learning for Optimal Control of Multiple-Model Discrete-Time Systems
    Skach, Jan
    Kiumarsi, Bahare
    Lewis, Frank L.
    Straka, Ondrej
    IEEE TRANSACTIONS ON CYBERNETICS, 2018, 48 (01) : 29 - 40
  • [43] A Zero-Sum Game-Based Hybrid Iteration Reinforcement Learning Scheme to Optimal Control for Fuzzy Singularly Perturbed Systems
    Dong, Jie
    Wang, Yun
    Su, Lei
    Shen, Hao
    INTERNATIONAL JOURNAL OF FUZZY SYSTEMS, 2025,
  • [44] Off-Policy Reinforcement Learning for Optimal Preview Tracking Control of Linear Discrete-Time systems with unknown dynamics
    Wang, Chao-Ran
    Wu, Huai-Ning
    2018 CHINESE AUTOMATION CONGRESS (CAC), 2018, : 1402 - 1407
  • [45] Off-policy safe reinforcement learning for nonlinear discrete-time systems
    Jha, Mayank Shekhar
    Kiumarsi, Bahare
    NEUROCOMPUTING, 2025, 611
  • [46] Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control
    Al-Tamimi, Asma
    Lewis, Frank L.
    Abu-Khalaf, Murad
    AUTOMATICA, 2007, 43 (03) : 473 - 481
  • [47] Robust control design for zero-sum differential games problem based on off-policy reinforcement learning technique
    Zhuang H.
    Zhu H.
    Wu S.
    Wang X.
    Mu Z.
    Shen Q.
    Aerospace Systems, 2024, 7 (02) : 261 - 269
  • [48] Reinforcement Q-Learning and Non-Zero-Sum Games Optimal Tracking Control for Discrete-Time Linear Multi-Input Systems
    Zhao, Jin-Gang
    2023 IEEE 12TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE, DDCLS, 2023, : 277 - 282
  • [49] Continual Reinforcement Learning Formulation for Zero-Sum Game-Based Constrained Optimal Tracking
    Farzanegan, Behzad
    Jagannathan, Sarangapani
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53 (12) : 7744 - 7757
  • [50] Optimal Control for Interconnected Multi-Area Power Systems With Unknown Dynamics: An Off-Policy Q-Learning Method
    Wang, Jing
    Mi, Xuanrui
    Shen, Hao
    Park, Ju H.
    Shi, Kaibo
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2024, 71 (05) : 2849 - 2853