Model-free optimal tracking policies for Markov jump systems by solving non-zero-sum games

Cited by: 3
Authors
Zhou, Peixin [1]
Xue, Huiwen [1]
Wen, Jiwei [1]
Shi, Peng [2, 3]
Luan, Xiaoli [1]
Affiliations
[1] Jiangnan Univ, Sch Internet Things Engn, Key Lab Adv Proc Control Light Ind, Minist Educ, Wuxi 214122, Peoples R China
[2] Univ Adelaide, Sch Elect & Mech Engn, Adelaide, SA 5005, Australia
[3] Obuda Univ, Res & Innovat Ctr, H-1034 Budapest, Hungary
Funding
National Natural Science Foundation of China
Keywords
Value iteration algorithm; Influence function; Adaptive optimal tracking; Non-zero-sum game; Nash equilibrium;
DOI
10.1016/j.ins.2023.119423
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper develops model-free optimal tracking policies for Markov jump systems by solving non-zero-sum games (NZSGs). First, coupled action and mode-dependent value functions (CAMDVFs) are constructed to solve a two-player NZSG and obtain Nash equilibrium solutions. Second, a value iteration (VI) algorithm is proposed that updates the policies for all modes in parallel, using data collected from the different operation modes within each iterative window. Moreover, the iteratively increasing convergence of the CAMDVFs is proved by introducing auxiliary functions between two adjacent iterations. Notably, an influence function is introduced to remove abnormal data, which improves the learning capability of the VI algorithm. Finally, the validity, self-adaptability, and application potential of the tracking policies are verified on a numerical example and a generalized economic model.
Pages: 17
Related papers
50 in total
  • [41] Non-Zero Sum Nash Game for Discrete-Time Infinite Markov Jump Stochastic Systems with Applications
    Liu, Yueying
    Wang, Zhen
    Lin, Xiangyun
    AXIOMS, 2023, 12 (09)
  • [42] Adaptive optimal safety tracking control for multiplayer mixed zero-sum games of continuous-time systems
    Qin, Chunbin
    Zhang, Zhongwei
    Shang, Ziyang
    Zhang, Jishi
    Zhang, Dehua
    APPLIED INTELLIGENCE, 2023, 53 (14) : 17460 - 17475
  • [43] Event-Triggered Adaptive Dynamic Programming for Non-Zero-Sum Games of Unknown Nonlinear Systems via Generalized Fuzzy Hyperbolic Models
    Zhang, Huaguang
    Su, Hanguang
    Zhang, Kun
    Luo, Yanhong
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2019, 27 (11) : 2202 - 2214
  • [44] Adaptive optimal safety tracking control for multiplayer mixed zero-sum games of continuous-time systems
    Chunbin Qin
    Zhongwei Zhang
    Ziyang Shang
    Jishi Zhang
    Dehua Zhang
    Applied Intelligence, 2023, 53 : 17460 - 17475
  • [45] Online Dual-Network-Based Adaptive Dynamic Programming for Solving Partially Unknown Multi-Player Non-Zero-Sum Games With Control Constraints
    Liu, Pengda
    Zhang, Huaguang
    Liu, Chong
    Su, Hanguang
    IEEE ACCESS, 2020, 8 : 182295 - 182306
  • [46] Model-free Adaptive Dynamic Programming for Online optimal Solution of the Unknown Nonlinear Zero-Sum Differential Game
    Qin, Chunbin
    Zhang, Huaguang
    Luo, Yanhong
    PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014, : 3815 - 3820
  • [47] Integral reinforcement learning-based event-triggered optimal tracking control for modular robot manipulators via non-zero-sum game
    Dong, Bo
    Ding, Zhendong
    An, Tianjiao
    Cui, Yiming
    Zhu, Xinye
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2024, 35 (09)
  • [48] Completely model-free approximate optimal tracking control for continuous-time nonlinear systems
    Xu, Zhenhui
    Shen, Tielong
    2020 59TH ANNUAL CONFERENCE OF THE SOCIETY OF INSTRUMENT AND CONTROL ENGINEERS OF JAPAN (SICE), 2020, : 239 - 244
  • [49] Integral reinforcement learning-based online adaptive event-triggered control for non-zero-sum games of partially unknown nonlinear systems
    Su, Hanguang
    Zhang, Huaguang
    Sun, Shaoxin
    Cai, Yuliang
    NEUROCOMPUTING, 2020, 377 : 243 - 255
  • [50] Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity
    Zhang, Kaiqing
    Kakade, Sham M.
    Basar, Tamer
    Yang, Lin F.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33