Model-free adaptive optimal control of continuous-time nonlinear non-zero-sum games based on reinforcement learning

Cited by: 5
Authors
Guo, Lei [1]
Zhao, Han [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, Beijing 100876, Peoples R China
Source
IET CONTROL THEORY AND APPLICATIONS | 2023 / Vol. 17 / Issue 02
Funding
National Natural Science Foundation of China
Keywords
APPROXIMATE OPTIMAL-CONTROL; LINEAR-SYSTEMS;
DOI
10.1049/cth2.12376
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, two novel algorithms are presented for finding the Nash equilibrium solution of non-zero-sum games for continuous-time input-affine nonlinear systems. Based on the integral reinforcement learning method, the integral-exploration-coupled Hamilton-Jacobi (HJ) equations are derived, which do not contain any information about the system dynamics. Then, based on neural network approximation, two different adaptive tuning laws for the weights are given to estimate the approximate solution of the coupled HJ equations. Both algorithms can estimate the value function and the policy without knowing or identifying the system dynamics. The closed-loop system stability and the convergence of the weights are guaranteed by Lyapunov analysis. Finally, simulation results for a two-player non-zero-sum game demonstrate the effectiveness of the algorithms.
Pages: 223 - 239
Page count: 17
Related papers
50 items in total
  • [41] Q-learning for continuous-time linear systems: A model-free infinite horizon optimal control approach
    Vamvoudakis, Kyriakos G.
    SYSTEMS & CONTROL LETTERS, 2017, 100 : 14 - 20
  • [42] Model-Free Reinforcement Learning by Embedding an Auxiliary System for Optimal Control of Nonlinear Systems
    Xu, Zhenhui
    Shen, Tielong
    Cheng, Daizhan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (04) : 1520 - 1534
  • [43] A Single-NN Iterative Adaptive Dynamic Programming Algorithm for Continuous-Time Nonlinear Zero-Sum Games
    Song, Ruizhuo
    Li, Junsong
    2018 37TH CHINESE CONTROL CONFERENCE (CCC), 2018, : 2848 - 2853
  • [44] Model-free finite-horizon optimal control of discrete-time two-player zero-sum games
    Wang, Wei
    Chen, Xin
    Du, Jianhua
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2023, 54 (01) : 167 - 179
  • [45] Model-free learning adaptive control for nonlinear systems with multiple time delay
    Hu, Z.Q.
    Li, X.D.
    Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology, 2001, 33 (02) : 261 - 264
  • [46] Online concurrent reinforcement learning algorithm to solve two-player zero-sum games for partially unknown nonlinear continuous-time systems
    Yasini, Sholeh
    Karimpour, Ali
    Sistani, Mohammad-Bagher Naghibi
    Modares, Hamidreza
    INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING, 2015, 29 (04) : 473 - 493
  • [47] Model-Free δ-Policy Iteration Based on Damped Newton Method for Nonlinear Continuous-Time H∞ Tracking Control
    Wang, Qi
    arXiv, 2024,
  • [48] Reinforcement learning based a non-zero-sum game for secure transmission against smart jamming
    Zhao, Chenyu
    Wang, Qing
    Liu, Xiaofeng
    Li, Chun
    Shi, Lidong
    DIGITAL SIGNAL PROCESSING, 2021, 112
  • [49] Adaptive Dynamic Programming for Model-Free Global Stabilization of Control Constrained Continuous-Time Systems
    Rizvi, Syed Ali Asad
    Lin, Zongli
    IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (02) : 1048 - 1060
  • [50] Optimal model-free adaptive control based on reinforcement Q-Learning for solar thermal collector fields
    Pataro, Igor M. L.
    Cunha, Rita
    Gil, Juan D.
    Guzman, Jose L.
    Berenguel, Manuel
    Lemos, Joao M.
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126