Generalized Policy Iteration for Continuous-Time Systems

Cited by: 0
Authors
Vrabie, Draguna [1]
Lewis, Frank L. [1]
Affiliation
[1] Univ Texas Arlington, Automat & Robot Res Inst, S Ft Worth, TX 76118 USA
Keywords
EQUATION; DESIGNS;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a unified view of the Approximate Dynamic Programming (ADP) algorithms developed in recent years for continuous-time (CT) systems. We introduce a continuous-time formulation of Generalized Policy Iteration (GPI) and show that it represents a spectrum of algorithms, with the exact Policy Iteration (PI) algorithm at one end and the Value Iteration (VI) algorithm at the other. In the middle of the spectrum, we formulate for the first time the Optimistic Policy Iteration (OPI) algorithm for CT systems. GPI is derived from a new formulation of the PI algorithm in which the value function is computed by an iterative process at the policy evaluation step. The GPI algorithm is implemented on an Actor/Critic structure. The results allow implementation of a family of adaptive controllers that converge online to the solution of the optimal control problem without knowing or identifying the internal dynamics of the system. Simulation results are provided to verify convergence to the optimal control solution.
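The PI-to-VI spectrum described in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation. It assumes a scalar linear system dx/dt = a*x + b*u with cost integral of q*x^2 + r*u^2, a value function V(x) = p*x^2, and an evaluation step of the form V_new(x(t)) = (cost over [t, t+T]) + V_old(x(t+T)) repeated `k_eval` times per policy. All names (`gpi_scalar_lqr`, `riccati_scalar`, `k_eval`, `T`) are hypothetical. Unlike the online method the abstract describes, this sketch uses the known model in closed form instead of measured trajectory data, purely to keep the example short.

```python
import math


def gpi_scalar_lqr(a, b, q, r, T, k_eval, n_outer, p0=0.0):
    """Sketch of generalized policy iteration for dx/dt = a*x + b*u.

    k_eval = 1 behaves like value iteration; a large k_eval approaches
    exact policy iteration; values in between are "optimistic" variants.
    Assumption: the closed-loop pole a - b*K never equals zero.
    """
    p = p0
    for _ in range(n_outer):
        K = b * p / r                    # policy improvement: u = -K*x
        a_c = a - b * K                  # closed-loop pole under this policy
        decay = math.exp(2.0 * a_c * T)  # (x(t+T) / x(t))**2
        # Cost accumulated over [t, t+T], per unit of x(t)**2.
        stage = (q + r * K * K) * (decay - 1.0) / (2.0 * a_c)
        for _ in range(k_eval):          # partial policy evaluation
            p = stage + decay * p
    return p


def riccati_scalar(a, b, q, r):
    """Positive root of 2*a*p - (b*p)**2 / r + q = 0, for comparison."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    p_star = riccati_scalar(-1.0, 1.0, 1.0, 1.0)
    for k in (1, 5, 50):
        p = gpi_scalar_lqr(-1.0, 1.0, 1.0, 1.0, T=0.1, k_eval=k, n_outer=300)
        print(k, p, abs(p - p_star))
```

In this toy setting every choice of `k_eval` shares the same fixed point, the Riccati solution, so the parameter changes only how much evaluation work is done before each policy update.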
Pages: 2677 - 2684
Page count: 8
Related papers
50 items in total
  • [31] Average optimality for continuous-time Markov decision processes with a policy iteration approach
    Zhu, Quanxin
    JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 2008, 339 (01) : 691 - 704
  • [32] Linear generalized synchronization of continuous-time chaotic systems
    Lu, JG
    Xi, YG
    CHAOS SOLITONS & FRACTALS, 2003, 17 (05) : 825 - 831
  • [33] Policy-Iteration-Based Adaptive Optimal Control for Uncertain Continuous-Time Linear Systems with Excitation Signals
    Lee, Jae Young
    Park, Jin Bae
    Choi, Yoon Ho
    INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2010), 2010, : 646 - 651
  • [34] Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear systems
    Lee, Jae Young
    Park, Jin Bae
    Choi, Yoon Ho
    AUTOMATICA, 2012, 48 (11) : 2850 - 2859
  • [35] Finite horizon optimal tracking control of partially unknown linear continuous-time systems using policy iteration
    Li, Chao
    Liu, Derong
    Li, Hongliang
    IET CONTROL THEORY AND APPLICATIONS, 2015, 9 (12) : 1791 - 1801
  • [36] Policy Iteration-based Indirect Adaptive Optimal Control for Completely Unknown Continuous-Time LTI Systems
    Jha, Sumit Kumar
    Roy, Sayan Basu
    Bhasin, Shubhendu
    2017 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2017, : 448 - 454
  • [37] Continuous-time approaches to identification of continuous-time systems
    Kowalczuk, Z
    Kozlowski, J
    AUTOMATICA, 2000, 36 (08) : 1229 - 1236
  • [38] Policy Iteration for Continuous-Time Average Reward Markov Decision Processes in Polish Spaces
    Zhu, Quanxin
    Yang, Xinsong
    Huang, Chuangxia
    ABSTRACT AND APPLIED ANALYSIS, 2009,
  • [39] Adaptive Optimal Control of Continuous-Time Linear Systems via Hybrid Iteration
    Qasem, Omar
    Gao, Weinan
    Bian, Tao
    2021 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2021), 2021,
  • [40] Recent developments in generalized predictive control for continuous-time systems
    Kouvaritakis, B
    Cannon, M
    Rossiter, JA
    INTERNATIONAL JOURNAL OF CONTROL, 1999, 72 (02) : 164 - 173