Q-learning-based non-zero sum games for Markov jump multiplayer systems under actor-critic NNs structure

Cited by: 1

Authors:

Wang, Yun [1]

Xia, Jiawei [2]

Wang, Jing [1]

Shen, Hao [1]

Affiliations:

[1] Anhui Univ Technol, Sch Elect & Informat Engn, Maanshan 243032, Peoples R China

[2] Liaocheng Univ, Sch Math Sci, Liaocheng 252059, Peoples R China

Source:

INFORMATION SCIENCES | 2024 / Vol. 681

Funding:

National Natural Science Foundation of China;

Keywords:

Markov jump systems; Q-learning; Integral reinforcement learning; Non-zero sum games; ADAPTIVE OPTIMAL-CONTROL; MULTIAGENT SYSTEMS; POWER-SYSTEMS; DESIGN; POLICY;

DOI:

10.1016/j.ins.2024.121196

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

This article addresses the problem of non-zero sum games for Markov jump multiplayer systems (MJMSs) using the reinforcement Q-learning method. Firstly, the Q-functions for each player are derived from the system states and the control inputs. On this basis, by incorporating the integral reinforcement learning scheme and the actor-critic neural networks structure, we design a novel reinforcement learning approach for MJMSs. It should be noted that the designed algorithm does not require any information about the system dynamics and transition probabilities. Furthermore, the stochastic stability and Nash equilibrium of MJMSs can be ensured by the designed algorithm. Finally, a simulation example is presented to illustrate the effectiveness of the designed approach.

Pages: 13
