Approximate Dynamic Programming Using Model-Free Bellman Residual Elimination

Cited by: 0
Authors
Bethke, Brett
How, Jonathan P.
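Affiliations
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract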
This paper presents a modification to the method of Bellman Residual Elimination (BRE) [1], [2] for approximate dynamic programming. Prior work on BRE has focused on learning an approximate policy for an underlying Markov Decision Process (MDP) when the state transition model of the MDP is known. This work proposes a model-free variant of BRE that does not require knowledge of the state transition model. Instead, state trajectories of the system, generated using simulation and/or observations of the real system in operation, are used to build stochastic approximations of the quantities needed to carry out the BRE algorithm. The resulting algorithm can be shown to converge to the policy produced by the nominal, model-based BRE algorithm in the limit of observing an infinite number of trajectories. To validate the performance of the approach, we compare model-based and model-free BRE against LSPI [3], a well-known approximate dynamic programming algorithm. Measuring performance in terms of both computational complexity and policy quality, we present results showing that BRE performs at least as well as, and sometimes significantly better than, LSPI on a standard benchmark problem.
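The idea of replacing model-dependent quantities with averages over observed trajectories can be illustrated with a minimal sketch. This is not the BRE algorithm from the paper; it only shows, for a fixed policy on a small hypothetical three-state chain, how the Bellman residual of a candidate cost-to-go function `J` can be estimated from sampled transitions instead of the transition matrix. The matrix `P`, stage costs `g`, candidate `J`, discount factor `alpha`, and all function names are assumptions made for illustration.

```python
import random

def model_based_residual(P, g, J, alpha):
    """Bellman residual J(i) - (g(i) + alpha * sum_j P[i][j] * J(j)), using the known model P."""
    n = len(J)
    return [J[i] - (g[i] + alpha * sum(P[i][j] * J[j] for j in range(n)))
            for i in range(n)]

def simulate(P, n_steps, rng, start=0):
    """Generate one state trajectory and return it as (state, next_state) pairs."""
    states = list(range(len(P)))
    s, pairs = start, []
    for _ in range(n_steps):
        s_next = rng.choices(states, weights=P[s])[0]
        pairs.append((s, s_next))
        s = s_next
    return pairs

def sampled_residual(transitions, g, J, alpha):
    """Same residual, with the expectation over next states replaced by a sample mean."""
    n = len(J)
    sums, counts = [0.0] * n, [0] * n
    for s, s_next in transitions:
        sums[s] += J[s_next]
        counts[s] += 1
    # Simplification: a state that was never visited gets an expected next value of 0.
    mean_next = [sums[i] / counts[i] if counts[i] else 0.0 for i in range(n)]
    return [J[i] - (g[i] + alpha * mean_next[i]) for i in range(n)]

# Hypothetical three-state chain under a fixed policy (illustrative values only).
P = [[0.9, 0.1, 0.0],
     [0.1, 0.8, 0.1],
     [0.0, 0.2, 0.8]]
g = [1.0, 0.0, 2.0]
J = [5.0, 3.0, 8.0]
alpha = 0.9

if __name__ == "__main__":
    rng = random.Random(0)
    exact = model_based_residual(P, g, J, alpha)
    for n_steps in (100, 10_000):
        approx = sampled_residual(simulate(P, n_steps, rng), g, J, alpha)
        gap = max(abs(a - e) for a, e in zip(approx, exact))
        print(n_steps, gap)
```

With longer trajectories the sampled residual should typically move closer to the model-based one, which mirrors, in a much simpler setting, the convergence behavior the abstract describes for an infinite number of trajectories.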
Pages: 4146-4151
Number of pages: 6