Generalized Policy Iteration for Continuous-Time Systems

Times cited: 0
Authors
Vrabie, Draguna [1]
Lewis, Frank L. [1]
Affiliations
[1] Univ Texas Arlington, Automat & Robot Res Inst, S Ft Worth, TX 76118 USA
Keywords
EQUATION; DESIGNS;
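DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract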
In this paper we present a unified view of the Approximate Dynamic Programming (ADP) algorithms developed in recent years for continuous-time (CT) systems. We introduce, in a continuous-time formulation, Generalized Policy Iteration (GPI) and show that it represents a spectrum of algorithms with the exact Policy Iteration (PI) algorithm at one end and the Value Iteration (VI) algorithm at the other. In the middle of the spectrum we formulate, for the first time, the Optimistic Policy Iteration (OPI) algorithm for CT systems. We derive GPI from a new formulation of the PI algorithm in which the value function is solved for iteratively at the policy evaluation step. The GPI algorithm is implemented on an Actor/Critic structure. The results allow implementation of a family of adaptive controllers that converge online to the solution of the optimal control problem without knowing or identifying the internal dynamics of the system. Simulation results are provided to verify convergence to the optimal control solution.
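To make the PI-to-VI spectrum in the abstract concrete, here is a minimal sketch for a scalar linear-quadratic problem. It is not the authors' implementation: the plant `dx/dt = a*x + b*u`, the cost weights `q` and `r`, the sample interval `T`, and the function names are all assumptions chosen for illustration. The parameter `k` is the number of value-update sweeps performed at each policy evaluation step: `k = 1` behaves like Value Iteration, a large `k` approaches exact Policy Iteration, and intermediate values correspond to the optimistic middle of the spectrum. The sketch uses the known dynamics only to generate the interval cost and the state decay that an online controller would instead measure along the trajectory, and it assumes the closed loop `a - b*K` stays strictly negative.

```python
import math


def gpi_scalar_lqr(a, b, q, r, T, k, outer_iters=200):
    """Generalized policy iteration sketch for dx/dt = a*x + b*u, u = -K*x.

    The value function is V(x) = p*x**2. Each outer step runs k sweeps of
    the interval update p <- cost + gamma*p under a fixed gain K, then
    improves the gain. Assumes a - b*K < 0 throughout (hypothetical setup).
    """
    p, K = 0.0, 0.0
    for _ in range(outer_iters):
        a_c = a - b * K                      # closed-loop pole under current gain
        gamma = math.exp(2.0 * a_c * T)      # x(t+T)**2 / x(t)**2
        # Cost accumulated over [t, t+T] starting from x(t) = 1.
        # Online, this integral would be measured rather than computed.
        cost = (q + r * K * K) * (gamma - 1.0) / (2.0 * a_c)
        for _ in range(k):                   # partial policy evaluation
            p = cost + gamma * p
        K = b * p / r                        # policy improvement
    return p


def riccati_scalar(a, b, q, r):
    """Positive root of the scalar Riccati equation 2*a*p - (b**2/r)*p**2 + q = 0."""
    s = b * b / r
    return (a + math.sqrt(a * a + s * q)) / s


if __name__ == "__main__":
    a, b, q, r, T = -1.0, 1.0, 1.0, 1.0, 0.1
    p_star = riccati_scalar(a, b, q, r)
    for k in (1, 5, 500):
        p = gpi_scalar_lqr(a, b, q, r, T, k)
        print(f"k={k:3d}  p={p:.8f}  |p - p*|={abs(p - p_star):.2e}")
```

Under these assumptions every choice of `k` shares the same fixed point, the Riccati solution; what changes is how much evaluation work is spent per policy update.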
Pages: 2677-2684
Page count: 8
Related papers
50 in total
  • [41] On the generalized algebraic Riccati equation for continuous-time descriptor systems
    Kawamoto, A
    Takaba, K
    Katayama, T
    LINEAR ALGEBRA AND ITS APPLICATIONS, 1999, 296 (1-3) : 1 - 14
  • [42] Continuous-Time Policy Optimization
    Zhan, Guojian
    Jiang, Yuxuan
    Duan, Jingliang
    Li, Shengbo Eben
    Cheng, Bo
    Li, Keqiang
    2023 AMERICAN CONTROL CONFERENCE, ACC, 2023, : 3382 - 3388
  • [43] Policy Iteration-Mode Monotone Convergence of Generalized Policy Iteration for Discrete-Time Linear Systems
    Chun, Tae Yoon
    Park, Jin Bae
    Choi, Yoon Ho
    2013 13TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2013), 2013, : 454 - 458
  • [44] Online adaptive optimal control for continuous-time Markov jump linear systems using a novel policy iteration algorithm
    He, Shuping
    Song, Jun
    Ding, Zhengtao
    Liu, Fei
    IET CONTROL THEORY AND APPLICATIONS, 2015, 9 (10): : 1536 - 1543
  • [45] Data-driven policy iteration algorithm for optimal control of continuous-time Ito stochastic systems with Markovian jumps
    Song, Jun
    He, Shuping
    Liu, Fei
    Niu, Yugang
    Ding, Zhengtao
    IET CONTROL THEORY AND APPLICATIONS, 2016, 10 (12): : 1431 - 1439
  • [46] Continuous-time systems
    Unknown
    LINEAR TIME VARYING SYSTEMS AND SAMPLED-DATA SYSTEMS, 2001, 265 : 7 - 94
  • [47] Continuous-Time Fitted Value Iteration for Robust Policies
    Lutter, Michael
    Belousov, Boris
    Mannor, Shie
    Fox, Dieter
    Garg, Animesh
    Peters, Jan
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (05) : 5534 - 5548
  • [48] Generalized continuous-time Riccati theory
    Department of Control Engineering, University Politehnica Bucharest, Bucharest
    Unknown
    Linear Algebra Its Appl, 1-3 (111-130):
  • [49] Generalized continuous-time Riccati theory
    Ionescu, V
    Oara, C
    LINEAR ALGEBRA AND ITS APPLICATIONS, 1996, 232 : 111 - 130
  • [50] Convergence Analysis of Value Iteration Adaptive Dynamic Programming for Continuous-Time Nonlinear Systems
    Xiao, Geyang
    Zhang, Huaguang
    IEEE TRANSACTIONS ON CYBERNETICS, 2024, 54 (03) : 1639 - 1649