Zero-sum game-based optimal control for discrete-time Markov jump systems: A parallel off-policy Q-learning method

Cited by: 1
Authors
Wang, Yun [1]
Fang, Tian [1]
Kong, Qingkai [1]
Li, Feng [1]
Affiliations
[1] Anhui Univ Technol, Anhui Prov Key Lab Power Elect & Mot Control, Maanshan 243002, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Markov jump systems; Optimal control; Q-learning method; Game-coupled algebraic Riccati equation; H-INFINITY CONTROL; LINEAR-SYSTEMS; CONTROL DESIGN; NETWORK;
DOI
10.1016/j.amc.2023.128462
Chinese Library Classification number
O29 [Applied Mathematics];
Subject classification code
070104;
Abstract
This paper solves the zero-sum game problem for linear discrete-time Markov jump systems with two novel model-free reinforcement Q-learning algorithms: on-policy Q-learning and off-policy Q-learning. First, the game-coupled algebraic Riccati equation is derived under the zero-sum game framework. On this basis, a subsystem transformation technique is employed to decouple the jumping modes. Then, a model-free on-policy Q-learning algorithm is introduced into the zero-sum game architecture to obtain the optimal control gain from measured system data. However, probing noise introduces bias into the on-policy algorithm, so an off-policy Q-learning algorithm is proposed to eliminate the effect of the probing noise. Convergence of the proposed methods is then discussed. Finally, an inverted pendulum system is used to verify the validity of the proposed methods.
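To make the game-coupled algebraic Riccati equation mentioned in the abstract more concrete, the following is a minimal sketch of a model-based value iteration on a coupled game Riccati recursion. It is not the paper's method: the paper's algorithms are model-free, while this sketch uses known system matrices. The equation form (a standard discrete-time zero-sum linear-quadratic game with the mode-averaged matrix `Pbar`), the function names `gcare_map` and `gcare_value_iteration`, and every numerical value (two modes, system matrices, transition probabilities, `gamma`) are assumptions chosen for illustration, since this record does not state them.

```python
import numpy as np

def gcare_map(P, A, B, E, Q, R, Pi, gamma):
    """One sweep of an assumed coupled game Riccati recursion over all modes."""
    n_modes = len(A)
    P_new = []
    for i in range(n_modes):
        # Mode-averaged cost matrix: expectation over the next jump mode.
        Pbar = sum(Pi[i, j] * P[j] for j in range(n_modes))
        m, q = B[i].shape[1], E[i].shape[1]
        G = np.hstack([B[i], E[i]])  # stacked control and disturbance inputs
        W = np.block([
            [R[i], np.zeros((m, q))],
            [np.zeros((q, m)), -gamma**2 * np.eye(q)],
        ])
        M = W + G.T @ Pbar @ G
        L = G.T @ Pbar @ A[i]
        P_new.append(Q[i] + A[i].T @ Pbar @ A[i] - L.T @ np.linalg.solve(M, L))
    return P_new

def gcare_value_iteration(A, B, E, Q, R, Pi, gamma, tol=1e-12, max_iter=5000):
    """Iterate gcare_map from P_i = 0 until successive iterates stop changing."""
    n = A[0].shape[0]
    P = [np.zeros((n, n)) for _ in A]
    for _ in range(max_iter):
        P_next = gcare_map(P, A, B, E, Q, R, Pi, gamma)
        change = max(np.abs(a - b).max() for a, b in zip(P, P_next))
        P = P_next
        if change < tol:
            break
    return P

# Hypothetical two-mode example; all numbers below are illustrative assumptions.
A = [np.array([[0.9, 0.1], [0.0, 0.8]]), np.array([[0.7, 0.2], [0.1, 0.6]])]
B = [np.array([[0.0], [1.0]])] * 2
E = [np.array([[0.1], [0.0]])] * 2
Q = [np.eye(2)] * 2
R = [np.eye(1)] * 2
Pi = np.array([[0.8, 0.2], [0.3, 0.7]])  # mode transition probabilities
gamma = 5.0                              # disturbance attenuation level

P = gcare_value_iteration(A, B, E, Q, R, Pi, gamma)
for i, Pi_mode in enumerate(P):
    print(f"mode {i}:\n{Pi_mode}")
```

A model-free Q-learning scheme of the kind the abstract describes would estimate the corresponding quadratic kernel from measured state, control, and disturbance data instead of using `A`, `B`, and `E` directly.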
Pages: 12
Related papers
50 in total
  • [31] Robust optimal tracking control for multiplayer systems by off-policy Q-learning approach
    Li, Jinna
    Xiao, Zhenfei
    Li, Ping
    Cao, Jiangtao
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2021, 31 (01) : 87 - 106
  • [32] Data-Driven Nonzero-Sum Game for Discrete-Time Systems Using Off-Policy Reinforcement Learning
    Yang, Yongliang
    Zhang, Sen
    Dong, Jie
    Yin, Yixin
    IEEE ACCESS, 2020, 8 : 14074 - 14088
  • [33] H∞ control of linear discrete-time systems: Off-policy reinforcement learning
    Kiumarsi, Bahare
    Lewis, Frank L.
    Jiang, Zhong-Ping
    AUTOMATICA, 2017, 78 : 144 - 152
  • [34] H∞ Optimal Control of Unknown Linear Discrete-time Systems: An Off-policy Reinforcement Learning Approach
    Kiumarsi, Bahare
    Modares, Hamidreza
    Lewis, Frank L.
    Jiang, Zhong-Ping
    PROCEEDINGS OF THE 2015 7TH IEEE INTERNATIONAL CONFERENCE ON CYBERNETICS AND INTELLIGENT SYSTEMS (CIS) AND ROBOTICS, AUTOMATION AND MECHATRONICS (RAM), 2015, : 41 - 46
  • [35] Optimal Control for Fuzzy Markov Jump Singularly Perturbed Systems: A Hybrid Zero-Sum Game Iteration Approach
    Wang, Jing
    Huang, Yaling
    Xie, Xiangpeng
    Yan, Huaicheng
    Shen, Hao
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2024, 32 (11) : 6388 - 6398
  • [36] Discrete-Time Optimal Control Scheme Based on Q-Learning Algorithm
    Wei, Qinglai
    Liu, Derong
    Song, Ruizhuo
    2016 SEVENTH INTERNATIONAL CONFERENCE ON INTELLIGENT CONTROL AND INFORMATION PROCESSING (ICICIP), 2016, : 125 - 130
  • [37] Zero-Sum Game-Based Optimal Secure Control Under Actuator Attacks
    Wu, Chengwei
    Li, Xiaolei
    Pan, Wei
    Liu, Jianxing
    Wu, Ligang
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2021, 66 (08) : 3773 - 3780
  • [38] H∞ Tracking Control of Unknown Discrete-Time Linear Systems via Output-Data-Driven Off-policy Q-learning Algorithm
    Zhang, Kun
    Liu, Xuantong
    Zhang, Lei
    Chen, Qian
    Peng, Yunjian
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 2350 - 2356
  • [39] Non-Zero Sum Nash Game for Discrete-Time Infinite Markov Jump Stochastic Systems with Applications
    Liu, Yueying
    Wang, Zhen
    Lin, Xiangyun
    AXIOMS, 2023, 12 (09)
  • [40] Optimal control and zero-sum game subject to multifactor uncertain random systems with jump
    Chen, Xin
    Tian, Chenlei
    Jin, Ting
    OPTIMIZATION, 2025, 74 (04) : 981 - 1022