Adaptive Observation-Based Efficient Reinforcement Learning for Uncertain Systems

Cited by: 9
Authors
Ran, Maopeng [1 ]
Xie, Lihua [1 ]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
Funding
National Research Foundation of Singapore;
Keywords
Optimal control; Observers; Adaptive systems; Adaptation models; Uncertain systems; Estimation; Data models; Adaptive observer; concurrent learning (CL); optimal control; reinforcement learning (RL); uncertain systems; CONTINUOUS-TIME; PARAMETER-ESTIMATION; LINEAR-SYSTEMS; ITERATION;
DOI
10.1109/TNNLS.2021.3070852
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
This article develops an adaptive observation-based efficient reinforcement learning (RL) approach for systems with uncertain drift dynamics. A novel concurrent learning adaptive extended observer (CL-AEO) is first designed to jointly estimate the system state and parameters. This observer has a two-time-scale structure and does not require any additional numerical techniques to compute state-derivative information. The idea of concurrent learning (CL) is leveraged to exploit recorded data, which leads to a relaxed, verifiable excitation condition for the convergence of the parameter estimates. Based on the state and parameter estimates provided by the CL-AEO, a simulation-of-experience-based RL scheme is developed to approximate the optimal control policy online. Rigorous theoretical analysis shows that practical convergence of the system state to the origin, and of the learned policy to the ideal optimal policy, is achieved without the persistence of excitation (PE) condition. Finally, the effectiveness and superiority of the developed methodology are demonstrated via comparative simulations.
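The concurrent-learning idea summarized above can be illustrated with a minimal sketch: a recorded batch of regressor/derivative pairs drives a gradient-style parameter update, so the estimate converges under a rank condition on the stored data rather than persistent excitation. This is a generic CL illustration, not the paper's CL-AEO; the dynamics, regressor, and gains below are hypothetical.

```python
import numpy as np

# Hypothetical scalar dynamics xdot = phi(x)^T theta with known regressor phi.
def phi(x):
    return np.array([x, x ** 2])

theta_true = np.array([-1.0, 0.5])   # unknown drift parameters (for simulation only)
theta_hat = np.zeros(2)              # parameter estimate
gamma, dt = 2.0, 0.001               # adaptation gain and step size (illustrative)

# Record a few (regressor, state-derivative) pairs once. CL only needs the
# stacked regressors to span the parameter space (a rank condition that can
# be verified online), not persistent excitation of the trajectory.
memory = [(phi(x), phi(x) @ theta_true) for x in (0.5, -1.0, 1.5)]

# Gradient-style CL update driven purely by the recorded data.
for _ in range(20000):
    grad = np.zeros(2)
    for phi_j, xdot_j in memory:
        grad += phi_j * (xdot_j - phi_j @ theta_hat)
    theta_hat += dt * gamma * grad

print(theta_hat)  # approaches theta_true = [-1.0, 0.5]
```

Because the stacked regressor matrix here has full column rank, the error dynamics are driven by a positive-definite data matrix and the estimate converges even though the "trajectory" is just three stored points.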
Pages: 5492-5503
Page count: 12