Off-Policy Actor-Critic Structure for Optimal Control of Unknown Systems With Disturbances

Cited by: 197
Authors
Song, Ruizhuo [1]
Lewis, Frank L. [2, 3]
Wei, Qinglai [4]
Zhang, Huaguang [5]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
[2] Univ Texas Arlington, UTA Res Inst, Ft Worth, TX 76118 USA
[3] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110004, Peoples R China
[4] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
[5] Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110004, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China; U.S. National Science Foundation
Keywords
Adaptive critic designs; adaptive/approximate dynamic programming (ADP); dynamic programming; off-policy; optimal control; unknown system; OPTIMAL TRACKING CONTROL; ADAPTIVE OPTIMAL-CONTROL; TIME NONLINEAR-SYSTEMS; OPTIMAL-CONTROL SCHEME; FEEDBACK-CONTROL; ALGORITHM; ITERATION; DESIGN;
DOI
10.1109/TCYB.2015.2421338
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper develops an optimal control method for unknown continuous-time systems with unknown disturbances. The integral reinforcement learning (IRL) algorithm is presented to obtain the iterative control. Off-policy learning is used to allow the dynamics to be completely unknown. Neural networks are used to construct critic and action networks. It is shown that if there are unknown disturbances, off-policy IRL may not converge or may be biased. To reduce the influence of unknown disturbances, a disturbance compensation controller is added. It is proven, based on Lyapunov techniques, that the weight errors are uniformly ultimately bounded. Convergence of the Hamiltonian function is also proven. The simulation study demonstrates the effectiveness of the proposed optimal control method for unknown systems with disturbances.
Pages: 1041-1050
Page count: 10
Related papers
50 in total
  • [41] Boosting On-Policy Actor-Critic With Shallow Updates in Critic
    Li, Luntong
    Zhu, Yuanheng
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, : 1 - 10
  • [42] Optimal Tracking Control for Robotic Manipulator using Actor-Critic Network
    Hu, Yong
    Cui, Lingguo
    Chai, Senchun
    2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021, : 1556 - 1561
  • [43] Hierarchical Sliding-Mode Surface-Based Adaptive Actor-Critic Optimal Control for Switched Nonlinear Systems With Unknown Perturbation
    Zhang, Haoyan
    Zhao, Xudong
    Wang, Huanqing
    Zong, Guangdeng
    Xu, Ning
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (02) : 1559 - 1571
  • [44] Off-policy neuro-optimal control for unknown complex-valued nonlinear systems based on policy iteration
    Ruizhuo Song
    Qinglai Wei
    Wendong Xiao
    Neural Computing and Applications, 2017, 28 : 1435 - 1441
  • [45] Relaxed Actor-Critic With Convergence Guarantees for Continuous-Time Optimal Control of Nonlinear Systems
    Duan, Jingliang
    Li, Jie
    Ge, Qiang
    Li, Shengbo Eben
    Bujarbaruah, Monimoy
    Ma, Fei
    Zhang, Dezhao
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2023, 8 (05): : 3299 - 3311
  • [46] Receding Horizon Actor-Critic Learning Control for Nonlinear Time-Delay Systems With Unknown Dynamics
    Liu, Jiahang
    Zhang, Xinglong
    Xu, Xin
    Xiong, Quan
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53 (08): : 4980 - 4993
  • [47] Fast and stable learning of quasi-passive dynamic walking by an unstable biped robot based on off-policy natural actor-critic
    Ueno, Tsuyoshi
    Nakamura, Yutaka
    Takuma, Takashi
    Shibata, Tomohiro
    Hosoda, Koh
    Ishii, Shin
    2006 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-12, 2006, : 5226 - +
  • [48] H∞ Optimal Control of Unknown Linear Discrete-time Systems: An Off-policy Reinforcement Learning Approach
    Kiumarsi, Bahare
    Modares, Hamidreza
    Lewis, Frank L.
    Jiang, Zhong-Ping
    PROCEEDINGS OF THE 2015 7TH IEEE INTERNATIONAL CONFERENCE ON CYBERNETICS AND INTELLIGENT SYSTEMS (CIS) AND ROBOTICS, AUTOMATION AND MECHATRONICS (RAM), 2015, : 41 - 46
  • [49] An Actor-Critic approach for control of Residential Photovoltaic-Battery Systems
    Joshi, Amit
    Tipaldi, Massimo
    Glielmo, Luigi
    IFAC PAPERSONLINE, 2021, 54 (07): : 222 - 227
  • [50] Adaptive actor-critic structure for parametrized controllers
    Goehrt, Thomas
    Osinenko, Pavel
    Streif, Stefan
    IFAC PAPERSONLINE, 2019, 52 (16): : 652 - 657