Approximate Dynamic Programming Using Model-Free Bellman Residual Elimination

Cited by: 0
Authors
Bethke, Brett
How, Jonathan P.
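Affiliations
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract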
This paper presents a modification to the method of Bellman Residual Elimination (BRE) [1], [2] for approximate dynamic programming. While prior work on BRE has focused on learning an approximate policy for an underlying Markov Decision Process (MDP) when the state transition model of the MDP is known, this work proposes a model-free variant of BRE that does not require knowledge of the state transition model. Instead, state trajectories of the system, generated using simulation and/or observations of the real system in operation, are used to build stochastic approximations of the quantities needed to carry out the BRE algorithm. The resulting algorithm can be shown to converge to the policy produced by the nominal, model-based BRE algorithm in the limit of observing an infinite number of trajectories. To validate the performance of the approach, we compare model-based and model-free BRE against LSPI [3], a well-known approximate dynamic programming algorithm. Measuring performance in terms of both computational complexity and policy quality, we present results showing that BRE performs at least as well as, and sometimes significantly better than, LSPI on a standard benchmark problem.
Pages: 4146 - 4151
Number of pages: 6
Related papers
50 in total
  • [31] A controller for a magnetic bearing using the Dynamic Programming Method of Bellman
    Steffani, HF
    Hofmann, W
    Cebulski, B
    PROCEEDINGS OF THE SIXTH INTERNATIONAL SYMPOSIUM ON MAGNETIC BEARINGS, 1998, : 569 - 576
  • [32] Model-Free Global Stabilization of Continuous-Time Linear Systems with Saturating Actuators Using Adaptive Dynamic Programming
    Rizvi, Syed Ali Asad
    Lin, Zongli
    2019 IEEE 58TH CONFERENCE ON DECISION AND CONTROL (CDC), 2019, : 145 - 150
  • [33] Multi-Agent Synchronization Using Online Model-Free Action Dependent Dual Heuristic Dynamic Programming Approach
    Abouheaf, Mohammed
    Gueaieb, Wail
    2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 2195 - 2201
  • [34] Model-Free H∞ Control Design for Unknown Continuous-Time Linear System Using Adaptive Dynamic Programming
    Qin, Chunbin
    Zhang, Huaguang
    Luo, Yanhong
    ASIAN JOURNAL OF CONTROL, 2016, 18 (02) : 609 - 618
  • [35] Hierarchical Dynamic Power Management Using Model-Free Reinforcement Learning
    Wang, Yanzhi
    Triki, Maryam
    Lin, Xue
    Ammari, Ahmed C.
    Pedram, Massoud
    PROCEEDINGS OF THE FOURTEENTH INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN (ISQED 2013), 2013, : 170 - 177
  • [36] Model-free quantification of dynamic PET data using nonparametric deconvolution
    Zanderigo, Francesca
    Parsey, Ramin V.
    Ogden, R. Todd
    JOURNAL OF CEREBRAL BLOOD FLOW AND METABOLISM, 2015, 35 (08) : 1368 - 1379
  • [37] Policy Gradient Adaptive Dynamic Programming for Model-Free Multi-Objective Optimal Control
    Zhang, Hao
    Li, Yan
    Wang, Zhuping
    Ding, Yi
    Yan, Huaicheng
    IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2024, 11 (04) : 1060 - 1062
  • [38] A Hybrid-Adaptive Dynamic Programming Approach for the Model-Free Control of Nonlinear Switched Systems
    Lu, Wenjie
    Zhu, Pingping
    Ferrari, Silvia
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2016, 61 (10) : 3203 - 3208
  • [39] Policy Gradient Adaptive Dynamic Programming for Model-Free Multi-Objective Optimal Control
    Hao Zhang
    Yan Li
    Zhuping Wang
    Yi Ding
    Huaicheng Yan
    IEEE/CAA Journal of Automatica Sinica, 2024, 11 (04) : 1060 - 1062
  • [40] Model-Free Inverse Optimal Control for Completely Unknown Nonlinear Systems by Adaptive Dynamic Programming
    Ahmadi, Peyman
    Rahmani, Mehdi
    Shahmansoorian, Aref
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2025,