Q-learning-based non-zero sum games for Markov jump multiplayer systems under actor-critic NNs structure

Cited by: 1
Authors
Wang, Yun [1]
Xia, Jiawei [2]
Wang, Jing [1]
Shen, Hao [1]
Affiliations
[1] Anhui Univ Technol, Sch Elect & Informat Engn, Maanshan 243032, Peoples R China
[2] Liaocheng Univ, Sch Math Sci, Liaocheng 252059, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Markov jump systems; Q-learning; Integral reinforcement learning; Non-zero sum games; ADAPTIVE OPTIMAL-CONTROL; MULTIAGENT SYSTEMS; POWER-SYSTEMS; DESIGN; POLICY
DOI
10.1016/j.ins.2024.121196
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This article addresses the problem of non-zero sum games for Markov jump multiplayer systems (MJMSs) using the reinforcement Q-learning method. Firstly, the Q-functions for each player are derived from the system states and the control inputs. On this basis, by incorporating the integral reinforcement learning scheme and the actor-critic neural networks structure, we design a novel reinforcement learning approach for MJMSs. It should be noted that the designed algorithm does not require any information about the system dynamics and transition probabilities. Furthermore, the stochastic stability and Nash equilibrium of MJMSs can be ensured by the designed algorithm. Finally, a simulation example is presented to illustrate the effectiveness of the designed approach.
Pages: 13
Related papers
15 in total
  • [1] Online reinforcement learning multiplayer non-zero sum games of continuous-time Markov jump linear systems
    Xin, Xilin
    Tu, Yidong
    Stojanovic, Vladimir
    Wang, Hai
    Shi, Kaibo
    He, Shuping
    Pan, Tianhong
    APPLIED MATHEMATICS AND COMPUTATION, 2022, 412
  • [2] A Natural Actor-Critic Framework for Zero-Sum Markov Games
    Alacaoglu, Ahmet
    Viano, Luca
    He, Niao
    Cevher, Volkan
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022: 307-366
  • [3] Non-Zero Sum Nash Game for Discrete-Time Infinite Markov Jump Stochastic Systems with Applications
    Liu, Yueying
    Wang, Zhen
    Lin, Xiangyun
    AXIOMS, 2023, 12 (09)
  • [4] Policy iteration-based non-zero sum differential feedback Nash control for continuous-time Markov jump linear systems
    Zhu G.-Z.
    Zhang M.-G.
    He S.-P.
    Kongzhi Lilun Yu Yingyong/Control Theory and Applications, 2020, 37 (08): 1749-1756
  • [5] Model-free optimal tracking policies for Markov jump systems by solving non-zero-sum games
    Zhou, Peixin
    Xue, Huiwen
    Wen, Jiwei
    Shi, Peng
    Luan, Xiaoli
    INFORMATION SCIENCES, 2023, 647
  • [6] Non-zero-sum games of discrete-time Markov jump systems with unknown dynamics: An off-policy reinforcement learning method
    Zhang, Xuewen
    Shen, Hao
    Li, Feng
    Wang, Jing
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2024, 34 (02): 949-968
  • [7] Non-zero sum Nash Q-learning for unknown deterministic continuous-time linear systems
    Vamvoudakis, Kyriakos G.
    AUTOMATICA, 2015, 61: 274-281
  • [8] Event-Triggered Optimal Tracking Control for Multiplayer Non-Zero-Sum Games of Nonlinear Systems via Concurrent Learning
    Qin, Yi
    Wang, Lijie
    2023 IEEE 12TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE, DDCLS, 2023: 479-484
  • [9] Zero-sum game-based optimal control for discrete-time Markov jump systems: A parallel off-policy Q-learning method
    Wang, Yun
    Fang, Tian
    Kong, Qingkai
    Li, Feng
    APPLIED MATHEMATICS AND COMPUTATION, 2024, 467
  • [10] Zero-sum game-based security control of unknown nonlinear Markov jump systems under false data injection attacks
    Gao, Xiaobin
    Deng, Feiqi
    Zeng, Pengyu
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2022