Approximate dynamic programming via direct search in the space of value function approximations

Cited by: 10
Authors
Arruda, E. F. [1]
Fragoso, M. D. [2]
do Val, J. B. R. [3]
Affiliations
[1] FENG PUCRS, BR-90619900 Porto Alegre, RS, Brazil
[2] CSC LNCC, BR-25651075 Petropolis, RJ, Brazil
[3] DT FEEC UNICAMP, BR-13083852 Campinas, SP, Brazil
Keywords
Dynamic programming; Markov decision processes; Convex optimization; Direct search methods; Convergence
DOI
10.1016/j.ejor.2010.11.019
Chinese Library Classification (CLC) number
C93 [Management Science];
Discipline classification code
12 ; 1201 ; 1202 ; 120202 ;
Abstract
This paper deals with approximate value iteration (AVI) algorithms applied to discounted dynamic programming (DP) problems. For a fixed control policy, the span semi-norm of the so-called Bellman residual is shown to be convex in the Banach space of candidate solutions to the DP problem. This fact motivates the introduction of an AVI algorithm with local search that seeks to minimize the span semi-norm of the Bellman residual in a convex value function approximation space. The novelty here is that the optimality of a point in the approximation architecture is characterized by means of convex optimization concepts, and necessary and sufficient conditions for local optimality are derived. The procedure employs the classical AVI algorithm direction (Bellman residual) combined with a set of independent search directions to improve the convergence rate. It has guaranteed convergence and satisfies, at least, the necessary optimality conditions over a prescribed set of directions. To illustrate the method, examples are presented that deal with a class of problems from the literature and a large state space queueing problem setting. © 2010 Elsevier B.V. All rights reserved.
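The convexity property mentioned in the abstract can be illustrated numerically. The following is only a minimal sketch, not the authors' algorithm: it assumes a small, made-up three-state Markov chain under a fixed policy (the matrix `P`, rewards `r`, and discount `gamma` are hypothetical), computes the Bellman residual T V - V, and evaluates its span semi-norm sp(x) = max(x) - min(x). It then checks the convexity inequality along one segment and runs plain value iteration, which steps in the Bellman residual direction.

```python
def bellman_residual(P, r, gamma, V):
    """Return T V - V for a fixed policy, with (T V)(s) = r[s] + gamma * sum_j P[s][j] * V[j]."""
    n = len(V)
    return [r[s] + gamma * sum(P[s][j] * V[j] for j in range(n)) - V[s]
            for s in range(n)]


def span(x):
    """Span semi-norm: difference between the largest and smallest component."""
    return max(x) - min(x)


# Hypothetical three-state chain under a fixed policy (illustrative values only).
P = [[0.5, 0.5, 0.0],
     [0.1, 0.6, 0.3],
     [0.0, 0.4, 0.6]]
r = [1.0, 0.0, 2.0]
gamma = 0.9


def residual_span(V):
    """Span semi-norm of the Bellman residual at the candidate value function V."""
    return span(bellman_residual(P, r, gamma, V))


# Convexity check along the segment between two arbitrary candidates.
V_a = [0.0, 0.0, 0.0]
V_b = [10.0, -5.0, 3.0]
lam = 0.3
V_mix = [lam * a + (1 - lam) * b for a, b in zip(V_a, V_b)]
lhs = residual_span(V_mix)
rhs = lam * residual_span(V_a) + (1 - lam) * residual_span(V_b)

# Plain value iteration: V <- V + (T V - V), i.e. a unit step along the residual.
V = list(V_a)
for _ in range(200):
    V = [v + d for v, d in zip(V, bellman_residual(P, r, gamma, V))]
final_span = residual_span(V)
```

Because the residual is affine in V for a fixed policy and the span is a semi-norm, `lhs` should not exceed `rhs`. Adding a constant to every component of V shifts the residual uniformly, so the span is unchanged; this is why the quantity is a semi-norm rather than a norm.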
Pages: 343-351
Number of pages: 9
Related papers
50 in total
  • [31] Approximate dynamic programming for management of high-value spare parts
    Simao, Hugo
    Powell, Warren
    JOURNAL OF MANUFACTURING TECHNOLOGY MANAGEMENT, 2009, 20 (02) : 147 - 160
  • [32] GENERATOR MAINTENANCE SCHEDULING VIA SUCCESSIVE APPROXIMATIONS DYNAMIC-PROGRAMMING
    ZURN, HH
    QUINTANA, VH
    IEEE TRANSACTIONS ON POWER APPARATUS AND SYSTEMS, 1975, PA94 (02): : 665 - 671
  • [33] Approximate dynamic programming for the aeromedical evacuation dispatching problem: Value function approximation utilizing multiple level aggregation
    Robbins, Matthew J.
    Jenkins, Phillip R.
    Bastian, Nathaniel D.
    Lunday, Brian J.
    OMEGA-INTERNATIONAL JOURNAL OF MANAGEMENT SCIENCE, 2020, 91
  • [34] Dynamic programming based optimized product quantization for approximate nearest neighbor search
    Cai, Yuanzheng
    Ji, Rongrong
    Li, Shaozi
    NEUROCOMPUTING, 2016, 217 : 110 - 118
  • [35] Simulation Analysis for UAV Search Algorithm Design Using Approximate Dynamic Programming
    Flint, Matthew
    Fernandez, Emmanuel
    Kelton, W. David
    MILITARY OPERATIONS RESEARCH, 2009, 14 (02) : 41 - 50
  • [36] Cooperative Navigation for Heterogeneous Autonomous Vehicles via Approximate Dynamic Programming
    Ferrari, Silvia
    Anderson, Michael
    Fierro, Rafael
    Lu, Wenjie
    2011 50TH IEEE CONFERENCE ON DECISION AND CONTROL AND EUROPEAN CONTROL CONFERENCE (CDC-ECC), 2011, : 121 - 127
  • [37] Optimized ensemble value function approximation for dynamic programming
    Cervellera, Cristiano
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2023, 309 (02) : 719 - 730
  • [38] Efficient dynamic programming for high-dimensional, optimal motion planning by spectral learning of approximate value function symmetries
    Vernaza, Paul
    Lee, Daniel D.
    2011 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2011,
  • [39] MurTree: Optimal Decision Trees via Dynamic Programming and Search
    Demirović, Emir
    Lukina, Anna
    Hebrard, Emmanuel
    Chan, Jeffrey
    Bailey, James
    Leckie, Christopher
    Ramamohanarao, Kotagiri
    Stuckey, Peter J.
    Journal of Machine Learning Research, 2022, 23
  • [40] MurTree: Optimal Decision Trees via Dynamic Programming and Search
    Demirovic, Emir
    Lukina, Anna
    Hebrard, Emmanuel
    Chan, Jeffrey
    Bailey, James
    Leckie, Christopher
    Ramamohanarao, Kotagiri
    Stuckey, Peter J.
    JOURNAL OF MACHINE LEARNING RESEARCH, 2022, 23