Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof

Cited by: 176

Authors:

Al-Tamimi, Asma [1]

Lewis, Frank [2]

Affiliations:

[1] Univ Texas, Automat & Robot Res Inst, Ft Worth, TX 76118 USA

[2] Univ Texas Arlington, Automat & Robot Res Inst, Ft Worth, TX 76118 USA

Source:

2007 IEEE INTERNATIONAL SYMPOSIUM ON APPROXIMATE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING | 2007

Funding:

National Science Foundation (USA);

Keywords:

adaptive critics; approximate dynamic programming; HJB; policy iterations;

DOI:

10.1109/ADPRL.2007.368167

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, a greedy iteration scheme based on approximate dynamic programming (ADP), namely Heuristic Dynamic Programming (HDP), is used to solve for the value function of the Hamilton-Jacobi-Bellman (HJB) equation that appears in discrete-time (DT) nonlinear optimal control. Two neural networks are used: one to approximate the value function and one to approximate the optimal control action. The importance of ADP is that it allows one to solve the HJB equation for general nonlinear discrete-time systems by using a neural network to approximate the value function. The contribution of this paper is a rigorous proof of convergence of the HDP iteration scheme for general discrete-time nonlinear systems with continuous state and action spaces. Two examples are provided. The first is a linear system, for which ADP is found to converge to the correct solution of the algebraic Riccati equation (ARE). The second considers a nonlinear control system.
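
The linear case mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a scalar system x_{k+1} = a x_k + b u_k with stage cost q x^2 + r u^2, replaces the two neural networks with an exact quadratic value function V_i(x) = p_i x^2 and a linear policy u = -k x, and starts from a zero value function. The function names and the numerical values of a, b, q, r are hypothetical. Under these assumptions the greedy iteration should approach the positive root of the scalar discrete-time ARE.

```python
import math


def hdp_scalar_lq(a, b, q, r, iterations=200, p0=0.0):
    """Greedy value iteration for x_{k+1} = a*x_k + b*u_k, cost q*x^2 + r*u^2.

    The value estimate is V_i(x) = p_i * x^2 (assumed parameterisation).
    Requires r > 0. Returns the list [p_0, p_1, ..., p_iterations].
    """
    history = [p0]
    p = p0
    for _ in range(iterations):
        # Greedy policy for the current value estimate: u = -k * x
        k = a * b * p / (r + b * b * p)
        # Value update: V_{i+1}(x) = q*x^2 + r*u^2 + V_i(a*x + b*u)
        a_closed = a - b * k
        p = q + r * k * k + p * a_closed * a_closed
        history.append(p)
    return history


def scalar_are_solution(a, b, q, r):
    """Positive root of b^2 p^2 + (r - q b^2 - a^2 r) p - q r = 0 (scalar ARE)."""
    c = r - q * b * b - a * a * r
    return (-c + math.sqrt(c * c + 4.0 * b * b * q * r)) / (2.0 * b * b)


if __name__ == "__main__":
    # Hypothetical open-loop unstable example (|a| > 1)
    a, b, q, r = 1.2, 1.0, 1.0, 1.0
    p_values = hdp_scalar_lq(a, b, q, r)
    print("iterated p:", p_values[-1])
    print("ARE root:  ", scalar_are_solution(a, b, q, r))
```

In this sketch the policy-improvement and value-update steps are computed in closed form; in the setting the abstract describes, each of those two steps would instead be fitted by a neural network.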


Pages: 38 / +

Page count: 2

50 related records in total

[1] Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof
Al-Tamimi, Asma
Lewis, Frank L.
Abu-Khalaf, Murad
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2008, 38 (04) : 943 - 949
[2] Online optimal control of nonlinear discrete-time systems using approximate dynamic programming
Dierks, Travis
Jagannathan, Sarangapani
Journal of Control Theory and Applications, 2011, 9 (03) : 361 - 369
[3] Online optimal control of nonlinear discrete-time systems using approximate dynamic programming
Dierks T.
Jagannathan S.
Journal of Control Theory and Applications, 2011, 9 (3) : 361 - 369
[4] Finite horizon discrete-time approximate dynamic programming
Liu, Derong
Jin, Ning
PROCEEDINGS OF THE 2006 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT CONTROL, 2006 : 75 - +
[5] Adaptive dynamic programming-based optimal control of unknown nonaffine nonlinear discrete-time systems with proof of convergence
Zhang, Xin
Zhang, Huaguang
Sun, Qiuye
Luo, Yanhong
NEUROCOMPUTING, 2012, 91 : 48 - 55
[6] Discrete-Time Optimal Control of State-Constrained Nonlinear Systems Using Approximate Dynamic Programming
Song, Shijie
Gong, Dawei
Zhu, Minglei
Zhao, Yuyang
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2025, 35 (03) : 858 - 871
[7] Policy Approximation in Policy Iteration Approximate Dynamic Programming for Discrete-Time Nonlinear Systems
Guo, Wentao
Si, Jennie
Liu, Feng
Mei, Shengwei
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (07) : 2794 - 2807
[8] Approximate Optimal tracking Control for Nonlinear Discrete-time Switched Systems via Approximate Dynamic Programming
Qin, Chunbin
Huang, Yizhe
Yang, Yabin
Zhang, Jishi
Liu, Xianxing
PROCEEDINGS OF THE 2019 31ST CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2019), 2019 : 1456 - 1461
[9] Convergence analysis and application of fuzzy-HDP for nonlinear discrete-time HJB systems
Zhu, Yuanheng
Zhao, Dongbin
Liu, Derong
NEUROCOMPUTING, 2015, 149 : 124 - 131
[10] CONVERGENCE IN UNCONSTRAINED DISCRETE-TIME DIFFERENTIAL DYNAMIC-PROGRAMMING
LIAO, LZ
SHOEMAKER, CA
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1991, 36 (06) : 692 - 706
