Approximate Dynamic Programming Using Model-Free Bellman Residual Elimination

Cited by: 0

Authors:

Bethke, Brett

How, Jonathan P.

Affiliation:

Source:

2010 American Control Conference | 2010

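Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract: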

This paper presents a modification to the method of Bellman Residual Elimination (BRE) [1], [2] for approximate dynamic programming. While prior work on BRE has focused on learning an approximate policy for an underlying Markov Decision Process (MDP) when the state transition model of the MDP is known, this work proposes a model-free variant of BRE that does not require knowledge of the state transition model. Instead, state trajectories of the system, generated using simulation and/or observations of the real system in operation, are used to build stochastic approximations of the quantities needed to carry out the BRE algorithm. The resulting algorithm can be shown to converge to the policy produced by the nominal, model-based BRE algorithm in the limit of observing an infinite number of trajectories. To validate the performance of the approach, we compare model-based and model-free BRE against LSPI [3], a well-known approximate dynamic programming algorithm. Measuring performance in terms of both computational complexity and policy quality, we present results showing that BRE performs at least as well as, and sometimes significantly better than, LSPI on a standard benchmark problem.
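The general idea of replacing a model-based expectation with a trajectory-based estimate can be illustrated with a minimal sketch. This is not the BRE algorithm from the paper; it only shows a sample-average approximation of a one-step Bellman backup under a fixed policy. The transition matrix `P`, stage costs `cost`, value guess `v`, and discount factor `alpha` are hypothetical values chosen for illustration, and `P` is used here only to simulate transitions and to compute the exact backup for comparison.

```python
import random


def model_based_backup(P, cost, v, alpha):
    """Exact backup under a fixed policy: cost[i] + alpha * sum_j P[i][j] * v[j]."""
    n = len(P)
    return [cost[i] + alpha * sum(P[i][j] * v[j] for j in range(n))
            for i in range(n)]


def sample_transitions(P, num_per_state, rng):
    """Simulate (state, next_state) pairs; stands in for observed trajectories."""
    n = len(P)
    data = []
    for i in range(n):
        for _ in range(num_per_state):
            j = rng.choices(range(n), weights=P[i])[0]
            data.append((i, j))
    return data


def model_free_backup(transitions, cost, v, alpha):
    """Sample-average estimate of the same backup; no transition model is used."""
    n = len(cost)
    totals = [0.0] * n
    counts = [0] * n
    for i, j in transitions:
        totals[i] += v[j]
        counts[i] += 1
    # States with no observed transitions fall back to a zero expectation (an assumption).
    return [cost[i] + alpha * (totals[i] / counts[i] if counts[i] else 0.0)
            for i in range(n)]


# Hypothetical 3-state chain induced by some fixed policy.
P = [[0.9, 0.1, 0.0],
     [0.2, 0.5, 0.3],
     [0.0, 0.4, 0.6]]
cost = [1.0, 0.0, 2.0]
v = [3.0, -1.0, 4.0]
alpha = 0.9

rng = random.Random(0)
exact = model_based_backup(P, cost, v, alpha)
approx = model_free_backup(sample_transitions(P, 20000, rng), cost, v, alpha)
max_err = max(abs(a - b) for a, b in zip(exact, approx))
```

As the number of sampled transitions per state grows, the sample average approaches the exact expectation by the law of large numbers, which mirrors the kind of limiting argument the abstract describes for the model-free variant.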

Pages: 4146-4151

Page count: 6
