Discrete-time dynamic graphical games: model-free reinforcement learning solution

Cited by: 4
Authors
Mohammed I. ABOUHEAF [1]
Frank L. LEWIS [2, 3]
Magdi S. MAHMOUD [1]
Dariusz G. MIKULSKI [4]
Affiliations
[1] Systems Engineering Department, King Fahd University of Petroleum & Mineral
[2] UTA Research Institute, University of Texas at Arlington, Fort Worth, Texas, U.S.A.
[3] State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University
Funding
U.S. National Science Foundation; National Natural Science Foundation of China
Keywords
Dynamic graphical games; Nash equilibrium; discrete mechanics; optimal control; model-free reinforcement learning; policy iteration;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces a model-free reinforcement learning technique for solving a class of dynamic games known as dynamic graphical games. The graphical game arises from multi-agent dynamical systems in which pinning control is used to make all the agents synchronize to the state of a command generator or leader agent. Novel coupled Bellman equations and Hamiltonian functions are developed for the dynamic graphical games. Hamiltonian mechanics is used to derive the necessary conditions for optimality. The solution for the dynamic graphical game is given in terms of the solution to a set of coupled Hamilton-Jacobi-Bellman equations developed herein. The Nash equilibrium solution for the graphical game is likewise given in terms of the solution to the underlying coupled Hamilton-Jacobi-Bellman equations. An online model-free policy iteration algorithm is developed to learn the Nash solution for the dynamic graphical game. This algorithm does not require any knowledge of the agents' dynamics. A proof of convergence for this multi-agent learning algorithm is given under a mild assumption about the inter-connectivity properties of the graph. A gradient descent technique with critic network structures is used to implement the policy iteration algorithm and solve the graphical game online in real time.
Pages: 55-69
Number of pages: 15
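
The pinning-control setup mentioned in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it uses a fixed feedback gain rather than a learned Nash policy, scalar single-integrator agents, and a three-agent chain graph, all of which are assumptions made here for illustration. The symbols `adjacency` (edge weights e_ij), `pinning` (pinning gains g_i), and the function names are likewise hypothetical. Each agent only sees its local neighborhood tracking error, which combines disagreement with its neighbors and, for pinned agents, disagreement with the leader.

```python
def local_tracking_errors(states, leader_state, adjacency, pinning):
    """Return eps_i = sum_j e_ij * (x_j - x_i) + g_i * (x_0 - x_i) for each agent.

    adjacency[i][j] is the (assumed) weight of the edge from agent j to agent i;
    pinning[i] is nonzero only for agents that observe the leader directly.
    """
    n = len(states)
    errors = []
    for i in range(n):
        neighbor_term = sum(
            adjacency[i][j] * (states[j] - states[i]) for j in range(n)
        )
        pinning_term = pinning[i] * (leader_state - states[i])
        errors.append(neighbor_term + pinning_term)
    return errors


def simulate(states, leader_state, adjacency, pinning, gain=0.5, steps=60):
    """Apply the fixed-gain update x_i <- x_i + gain * eps_i for a number of steps.

    The constant gain is an illustrative stand-in for a learned policy.
    """
    states = list(states)
    for _ in range(steps):
        errors = local_tracking_errors(states, leader_state, adjacency, pinning)
        states = [x + gain * e for x, e in zip(states, errors)]
    return states


# Assumed chain graph: leader -> agent 0 -> agent 1 -> agent 2.
ADJACENCY = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
]
PINNING = [1, 0, 0]  # only agent 0 is pinned to the leader
LEADER_STATE = 1.0   # constant leader state, assumed for simplicity

final_states = simulate([0.0, -2.0, 3.0], LEADER_STATE, ADJACENCY, PINNING)
```

In this sketch only agent 0 measures the leader, yet all three agents approach the leader's state because the pinning information propagates along the graph edges. In the graphical game of the abstract, each agent would instead choose its control to minimize its own cost given its neighbors' policies.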