Generalized Policy Iteration for Continuous-Time Systems

Cited by: 0
Authors
Vrabie, Draguna [1]
Lewis, Frank L. [1]
Affiliations
[1] Univ Texas Arlington, Automat & Robot Res Inst, S Ft Worth, TX 76118 USA
Keywords
EQUATION; DESIGNS;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a unified view of the Approximate Dynamic Programming (ADP) algorithms developed in recent years for continuous-time (CT) systems. We introduce a continuous-time formulation of Generalized Policy Iteration (GPI) and show that it represents a spectrum of algorithms, with the exact Policy Iteration (PI) algorithm at one end and the Value Iteration (VI) algorithm at the other. In the middle of the spectrum we formulate, for the first time, the Optimistic Policy Iteration (OPI) algorithm for CT systems. GPI is introduced starting from a new formulation of the PI algorithm in which the value function is solved for iteratively at the policy evaluation step. The GPI algorithm is implemented on an Actor/Critic structure. The results allow implementation of a family of adaptive controllers that converge online to the solution of the optimal control problem without knowing or identifying the internal dynamics of the system. Simulation results are provided to verify convergence to the optimal control solution.
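The PI-to-VI spectrum described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the paper's controllers learn online without knowing the internal dynamics, whereas the sketch below assumes a fully known scalar linear system dx/dt = a x + b u with cost integrand q x^2 + r u^2, and uses the model directly. The names `gpi_scalar_lqr`, `riccati_scalar`, `sweeps`, and the interval length `T` are hypothetical and chosen only for illustration. The parameter `sweeps` sets how many iterative policy-evaluation updates run before each policy improvement: one sweep behaves like value iteration, many sweeps approach exact policy iteration, and intermediate values correspond to the optimistic middle of the spectrum.

```python
import math


def gpi_scalar_lqr(a, b, q, r, T=0.1, sweeps=1, outer_iters=300, k0=0.0):
    """Model-based GPI sketch for dx/dt = a*x + b*u, cost = integral of q*x^2 + r*u^2.

    Assumption: the initial gain k0 is stabilizing (a - b*k0 < 0).
    Value function is V(x) = p*x^2, policy is u = -k*x.
    """
    p, k = 0.0, k0
    for _ in range(outer_iters):
        a_c = a - b * k                      # closed-loop dynamics under the current policy
        phi2 = math.exp(2.0 * a_c * T)       # x(T)^2 / x(0)^2 under the current policy
        # Cost accumulated over [0, T] from x(0) = 1 under the current policy.
        stage = (q + r * k * k) * (phi2 - 1.0) / (2.0 * a_c)
        # Partial policy evaluation: repeat the interval Bellman update `sweeps` times.
        for _ in range(sweeps):
            p = stage + phi2 * p
        # Policy improvement: greedy gain with respect to the current value estimate.
        k = b * p / r
    return p, k


def riccati_scalar(a, b, q, r):
    """Positive root of the scalar Riccati equation 2*a*p - (b*p)^2/r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    p_star = riccati_scalar(-1.0, 1.0, 1.0, 1.0)
    for n in (1, 5, 50):
        p, k = gpi_scalar_lqr(-1.0, 1.0, 1.0, 1.0, sweeps=n)
        print(n, p, k, abs(p - p_star))
```

For a = -1 and b = q = r = 1 the Riccati solution is sqrt(2) - 1, and each setting of `sweeps` should approach it; the settings differ only in how much evaluation work is done per policy update.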
Pages: 2677-2684
Number of pages: 8
Related papers
50 in total
  • [21] Value Iteration for Continuous-Time Linear Time-Invariant Systems
    Possieri, Corrado
    Sassano, Mario
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (05) : 3070 - 3077
  • [22] Near-Optimal Controller for Nonlinear Continuous-Time Systems With Unknown Dynamics Using Policy Iteration
    Dutta, Samrat
    Patchaikani, Prem Kumar
    Behera, Laxmidhar
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2016, 27 (07) : 1537 - 1549
  • [23] Neuro-Control for Continuous-Time Stochastic Nonlinear Systems via Online Policy Iteration Algorithm
    Zhou, Tianmin
    Hou, Jiaxu
    Li, Handong
    Di, Zengru
    Zhao, Bo
    PROCEEDINGS OF THE 32ND 2020 CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2020), 2020, : 1499 - 1503
  • [24] Policy Iteration-Based Learning Design for Linear Continuous-Time Systems Under Initial Stabilizing OPFB Policy
    Zhang, Chengye
    Chen, Ci
    Lewis, Frank L.
    Xie, Shengli
    IEEE TRANSACTIONS ON CYBERNETICS, 2024, 54 (11) : 6707 - 6718
  • [25] Bias-policy iteration based adaptive dynamic programming for unknown continuous-time linear systems
    Jiang, Huaiyuan
    Zhou, Bin
    AUTOMATICA, 2022, 136
  • [26] Policy Iteration-Based Learning Design for Linear Continuous-Time Systems Under Initial Stabilizing OPFB Policy
    Zhang, Chengye
    Chen, Ci
    Lewis, Frank L.
    Xie, Shengli
    IEEE TRANSACTIONS ON CYBERNETICS, 2024, : 6707 - 6718
  • [27] Bias-policy iteration based optimal control for unknown continuous-time linear periodic systems
    Li, Xiang
    Jiang, Huaiyuan
    Zhou, Bin
    SYSTEMS & CONTROL LETTERS, 2024, 189
  • [28] Continuous-time identification of continuous-time systems
    Kowalczuk, Z
    Kozlowski, J
    (SYSID'97): SYSTEM IDENTIFICATION, VOLS 1-3, 1998, : 1293 - 1298
  • [29] Value Iteration and Adaptive Optimal Control for Linear Continuous-time Systems
    Bian, Tao
    Jiang, Zhong-Ping
    PROCEEDINGS OF THE 2015 7TH IEEE INTERNATIONAL CONFERENCE ON CYBERNETICS AND INTELLIGENT SYSTEMS (CIS) AND ROBOTICS, AUTOMATION AND MECHATRONICS (RAM), 2015, : 53 - 58
  • [30] On Λ - φ generalized synchronization of chaotic dynamical systems in continuous-time
    Ouannas, A.
    Al-sawalha, M. M.
    EUROPEAN PHYSICAL JOURNAL-SPECIAL TOPICS, 2016, 225 (01) : 187 - 196