Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof

Cited by: 176
Authors
Al-Tamimi, Asma [1]
Lewis, Frank [2]
Affiliations
[1] Univ Texas, Automat & Robot Res Inst, Ft Worth, TX 76118 USA
[2] Univ Texas Arlington, Automat & Robot Res Inst, Ft Worth, TX 76118 USA
Funding
US National Science Foundation;
Keywords
adaptive critics; approximate dynamic programming; HJB; policy iterations;
DOI
10.1109/ADPRL.2007.368167
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, a greedy iteration scheme based on approximate dynamic programming (ADP), namely heuristic dynamic programming (HDP), is used to solve for the value function of the Hamilton-Jacobi-Bellman (HJB) equation that appears in discrete-time (DT) nonlinear optimal control. Two neural networks are used: one to approximate the value function and one to approximate the optimal control action. ADP is important because it allows the HJB equation to be solved for general nonlinear discrete-time systems by using a neural network to approximate the value function. The contribution of this paper is a rigorous proof of convergence of the HDP iteration scheme for general discrete-time nonlinear systems with continuous state and action spaces. Two examples are provided. In the first, a linear system, ADP is found to converge to the correct solution of the algebraic Riccati equation (ARE). The second considers a nonlinear control system.
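The linear case mentioned in the abstract can be illustrated without neural networks. Assuming a quadratic stage cost x'Qx + u'Ru (an assumption; the abstract does not state the cost), the HDP recursion V_{i+1}(x) = min_u [x'Qx + u'Ru + V_i(Ax + Bu)] started from V_0 = 0 keeps the quadratic form V_i(x) = x'P_i x, so each greedy step reduces to a matrix update. The following is a minimal sketch under that assumption; the function names and the matrices A, B, Q, R are hypothetical and not taken from the paper.

```python
import numpy as np


def hdp_linear_quadratic(A, B, Q, R, max_iters=2000, tol=1e-10):
    """Greedy HDP (value) iteration for x_{k+1} = A x_k + B u_k.

    Assumes the value function is V_i(x) = x' P_i x with V_0 = 0,
    so no function approximator is needed in this special case.
    """
    n = A.shape[0]
    P = np.zeros((n, n))  # V_0(x) = 0
    for _ in range(max_iters):
        # Greedy action u = -K x minimizes x'Qx + u'Ru + V_i(Ax + Bu).
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        A_cl = A - B @ K
        # Value update: V_{i+1}(x) = x'Qx + u'Ru + V_i(Ax + Bu) at the greedy u.
        P_next = Q + K.T @ R @ K + A_cl.T @ P @ A_cl
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    # Gain that is greedy with respect to the final value function.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K


def riccati_residual(A, B, Q, R, P):
    """Residual of the discrete-time algebraic Riccati equation at P."""
    gain_term = A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return Q + A.T @ P @ A - gain_term - P


# Hypothetical example system (a sampled double integrator), not from the paper.
A_demo = np.array([[1.0, 0.1], [0.0, 1.0]])
B_demo = np.array([[0.005], [0.1]])
Q_demo = np.eye(2)
R_demo = np.array([[1.0]])
```

Calling `hdp_linear_quadratic(A_demo, B_demo, Q_demo, R_demo)` returns a matrix P and gain K; passing that P to `riccati_residual` gives a quick check of whether the iteration has reached a fixed point of the discrete-time Riccati equation. For nonlinear systems this closed-form update is not available, which is where the two approximating networks described in the abstract come in.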
Pages: 38 / +
Page count: 2
Related papers
50 in total
  • [1] Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof
    Al-Tamimi, Asma
    Lewis, Frank L.
    Abu-Khalaf, Murad
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2008, 38 (04) : 943 - 949
  • [2] Online optimal control of nonlinear discrete-time systems using approximate dynamic programming
    Travis DIERKS
    Sarangapani JAGANNATHAN
    Journal of Control Theory and Applications, 2011, 9 (03) : 361 - 369
  • [3] Online optimal control of nonlinear discrete-time systems using approximate dynamic programming
    Dierks T.
    Jagannathan S.
    Journal of Control Theory and Applications, 2011, 9 (3) : 361 - 369
  • [4] Finite horizon discrete-time approximate dynamic programming
    Liu, Derong
    Jin, Ning
    PROCEEDINGS OF THE 2006 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT CONTROL, 2006, : 75 - +
  • [5] Adaptive dynamic programming-based optimal control of unknown nonaffine nonlinear discrete-time systems with proof of convergence
    Zhang, Xin
    Zhang, Huaguang
    Sun, Qiuye
    Luo, Yanhong
    NEUROCOMPUTING, 2012, 91 : 48 - 55
  • [6] Discrete-Time Optimal Control of State-Constrained Nonlinear Systems Using Approximate Dynamic Programming
    Song, Shijie
    Gong, Dawei
    Zhu, Minglei
    Zhao, Yuyang
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2025, 35 (03) : 858 - 871
  • [7] Policy Approximation in Policy Iteration Approximate Dynamic Programming for Discrete-Time Nonlinear Systems
    Guo, Wentao
    Si, Jennie
    Liu, Feng
    Mei, Shengwei
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (07) : 2794 - 2807
  • [8] Approximate Optimal tracking Control for Nonlinear Discrete-time Switched Systems via Approximate Dynamic Programming
    Qin, Chunbin
    Huang, Yizhe
    Yang, Yabin
    Zhang, Jishi
    Liu, Xianxing
    PROCEEDINGS OF THE 2019 31ST CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2019), 2019, : 1456 - 1461
  • [9] Convergence analysis and application of fuzzy-HDP for nonlinear discrete-time HJB systems
    Zhu, Yuanheng
    Zhao, Dongbin
    Liu, Derong
    NEUROCOMPUTING, 2015, 149 : 124 - 131
  • [10] CONVERGENCE IN UNCONSTRAINED DISCRETE-TIME DIFFERENTIAL DYNAMIC-PROGRAMMING
    LIAO, LZ
    SHOEMAKER, CA
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1991, 36 (06) : 692 - 706