Off-Policy Actor-Critic Structure for Optimal Control of Unknown Systems With Disturbances

Cited by: 197
Authors
Song, Ruizhuo [1]
Lewis, Frank L. [2, 3]
Wei, Qinglai [4]
Zhang, Huaguang [5]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
[2] Univ Texas Arlington, UTA Res Inst, Ft Worth, TX 76118 USA
[3] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110004, Peoples R China
[4] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
[5] Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110004, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China; U.S. National Science Foundation
Keywords
Adaptive critic designs; adaptive/approximate dynamic programming (ADP); dynamic programming; off-policy; optimal control; unknown system; OPTIMAL TRACKING CONTROL; ADAPTIVE OPTIMAL-CONTROL; TIME NONLINEAR-SYSTEMS; OPTIMAL-CONTROL SCHEME; FEEDBACK-CONTROL; ALGORITHM; ITERATION; DESIGN;
DOI
10.1109/TCYB.2015.2421338
Chinese Library Classification (CLC)
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper develops an optimal control method for unknown continuous-time systems subject to unknown disturbances. An integral reinforcement learning (IRL) algorithm is presented to obtain the iterative control. Off-policy learning is used so that the system dynamics can be completely unknown. Neural networks are used to construct the critic and action networks. It is shown that, when unknown disturbances are present, off-policy IRL may fail to converge or may converge to a biased solution. To reduce the influence of the unknown disturbances, a disturbance compensation controller is added. Using Lyapunov techniques, the weight errors are proven to be uniformly ultimately bounded. Convergence of the Hamiltonian function is also proven. A simulation study demonstrates the effectiveness of the proposed optimal control method for unknown systems with disturbances.
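The abstract does not state the underlying equations. As a hedged sketch only, assuming the control-affine setting common in the IRL literature, the off-policy IRL relation and one way a disturbance can bias it may be written as below. The symbols $f$, $g$, $Q$, $R$, $T$, $d$ and the iteration index $i$ are assumptions for illustration and are not taken from this record.

```latex
% Assumed setting (not given in the abstract): control-affine dynamics with
% behavior policy u and unknown disturbance d; target policy u^{(i)}.
\begin{align}
  \dot{x} &= f(x) + g(x)\,u + d, \\
  % Value of the target policy u^{(i)} on the disturbance-free system
  V^{(i)}(x(t)) &= \int_{t}^{\infty}
      \bigl( Q(x) + u^{(i)\top} R\, u^{(i)} \bigr)\,\mathrm{d}\tau, \\
  % Policy improvement step
  u^{(i+1)} &= -\tfrac{1}{2}\, R^{-1} g^{\top}(x)\, \nabla V^{(i)}(x), \\
  % Off-policy IRL relation over [t, t+T] when d = 0
  V^{(i)}(x(t+T)) - V^{(i)}(x(t))
    &= -\int_{t}^{t+T} \bigl( Q(x) + u^{(i)\top} R\, u^{(i)} \bigr)\,\mathrm{d}\tau
       - 2\int_{t}^{t+T} u^{(i+1)\top} R\,\bigl( u - u^{(i)} \bigr)\,\mathrm{d}\tau .
\end{align}
% When d is nonzero, the right-hand side of the last relation gains the
% unmeasured residual term
%   \int_{t}^{t+T} \nabla V^{(i)\top}(x)\, d \,\mathrm{d}\tau .
```

In this sketch neither $f$ nor $g$ appears explicitly in the disturbance-free relation when $V^{(i)}$ and $u^{(i+1)}$ are both treated as unknowns, which is consistent with the claim that the dynamics can be completely unknown. The residual term depends on the unmeasured $d$, which suggests why ignoring it may bias the solution; the paper's own derivation and compensation controller may differ from this form.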
Pages: 1041-1050
Page count: 10