Adaptive Observation-Based Efficient Reinforcement Learning for Uncertain Systems

Cited by: 9
Authors
Ran, Maopeng [1 ]
Xie, Lihua [1 ]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
Funding
National Research Foundation of Singapore;
Keywords
Optimal control; Observers; Adaptive systems; Adaptation models; Uncertain systems; Estimation; Data models; Adaptive observer; concurrent learning (CL); optimal control; reinforcement learning (RL); uncertain systems; CONTINUOUS-TIME; PARAMETER-ESTIMATION; LINEAR-SYSTEMS; ITERATION;
DOI
10.1109/TNNLS.2021.3070852
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
This article develops an adaptive observation-based efficient reinforcement learning (RL) approach for systems with uncertain drift dynamics. A novel concurrent learning adaptive extended observer (CL-AEO) is first designed to jointly estimate the system state and parameters. This observer has a two-time-scale structure and does not require any additional numerical techniques to compute state-derivative information. The idea of concurrent learning (CL) is leveraged to exploit recorded data, which leads to a relaxed, verifiable excitation condition for the convergence of the parameter estimates. Based on the state and parameter estimates provided by the CL-AEO, a simulation-of-experience-based RL scheme is developed to approximate the optimal control policy online. Rigorous theoretical analysis shows that practical convergence of the system state to the origin, and of the learned policy to the ideal optimal policy, is achieved without the persistence of excitation (PE) condition. Finally, the effectiveness and superiority of the developed methodology are demonstrated via comparative simulations.
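The concurrent-learning idea summarized above can be illustrated with a minimal sketch: a recorded batch of regressor/derivative pairs drives a gradient-style parameter update, so the estimate converges under a rank condition on the stored data rather than persistent excitation. This is a generic CL illustration, not the paper's CL-AEO; the dynamics, regressor, and gains below are hypothetical.

```python
import numpy as np

# Hypothetical scalar dynamics xdot = phi(x)^T theta with known regressor phi.
def phi(x):
    return np.array([x, x ** 2])

theta_true = np.array([-1.0, 0.5])   # unknown drift parameters (for simulation only)
theta_hat = np.zeros(2)              # parameter estimate
gamma, dt = 2.0, 0.001               # adaptation gain and step size (illustrative)

# Record a few (regressor, state-derivative) pairs once. CL only needs the
# stacked regressors to span the parameter space (a rank condition that can
# be verified online), not persistent excitation of the trajectory.
memory = [(phi(x), phi(x) @ theta_true) for x in (0.5, -1.0, 1.5)]

# Gradient-style CL update driven purely by the recorded data.
for _ in range(20000):
    grad = np.zeros(2)
    for phi_j, xdot_j in memory:
        grad += phi_j * (xdot_j - phi_j @ theta_hat)
    theta_hat += dt * gamma * grad

print(theta_hat)  # approaches theta_true = [-1.0, 0.5]
```

Because the stacked regressor matrix here has full column rank, the error dynamics are driven by a positive-definite data matrix and the estimate converges even though the "trajectory" is just three stored points.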
Pages: 5492-5503
Page count: 12