Approximate dynamic programming via direct search in the space of value function approximations

Cited by: 10
Authors
Arruda, E. F. [1]
Fragoso, M. D. [2]
do Val, J. B. R. [3]
Affiliations
[1] FENG PUCRS, BR-90619900 Porto Alegre, RS, Brazil
[2] CSC LNCC, BR-25651075 Petropolis, RJ, Brazil
[3] DT FEEC UNICAMP, BR-13083852 Campinas, SP, Brazil
Keywords
Dynamic programming; Markov decision processes; Convex optimization; Direct search methods; Convergence
DOI
10.1016/j.ejor.2010.11.019
Chinese Library Classification number
C93 [Management Science]
Discipline classification codes
12; 1201; 1202; 120202
Abstract
This paper deals with approximate value iteration (AVI) algorithms applied to discounted dynamic programming (DP) problems. For a fixed control policy, the span semi-norm of the so-called Bellman residual is shown to be convex in the Banach space of candidate solutions to the DP problem. This fact motivates the introduction of an AVI algorithm with local search that seeks to minimize the span semi-norm of the Bellman residual in a convex value function approximation space. The novelty here is that the optimality of a point in the approximation architecture is characterized by means of convex optimization concepts, and necessary and sufficient conditions for local optimality are derived. The procedure combines the classical AVI direction (the Bellman residual) with a set of independent search directions to improve the convergence rate. It has guaranteed convergence and satisfies, at least, the necessary optimality conditions over a prescribed set of directions. To illustrate the method, examples are presented that deal with a class of problems from the literature and a large state space queueing problem setting. © 2010 Elsevier B.V. All rights reserved.
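The two quantities the abstract relies on, the Bellman residual for a fixed policy and its span semi-norm, can be illustrated with a minimal sketch. It is not the paper's algorithm: the three-state transition matrix `P`, the cost vector `COST`, the discount `GAMMA`, and all function names are hypothetical choices made only for illustration. For a fixed policy the residual is affine in the candidate value function, which is why its span semi-norm is convex, and one step along the residual (the classical value iteration direction) shrinks that span by at least the discount factor.

```python
# Illustrative 3-state chain under a fixed policy (all numbers are assumptions).
GAMMA = 0.9
P = [[0.5, 0.5, 0.0],
     [0.1, 0.6, 0.3],
     [0.0, 0.4, 0.6]]   # each row sums to 1
COST = [1.0, 2.0, 0.5]


def bellman_residual(v):
    """Return T(v) - v, where T(v) = COST + GAMMA * P v for the fixed policy."""
    n = len(v)
    return [COST[i] + GAMMA * sum(P[i][j] * v[j] for j in range(n)) - v[i]
            for i in range(n)]


def span(x):
    """Span semi-norm: max(x) - min(x); it is zero for constant vectors."""
    return max(x) - min(x)


def residual_span(v):
    """Span semi-norm of the Bellman residual at the candidate v."""
    return span(bellman_residual(v))


def value_iteration_step(v):
    """Move v along the Bellman residual direction, i.e. return T(v)."""
    return [vi + ri for vi, ri in zip(v, bellman_residual(v))]


if __name__ == "__main__":
    v = [3.0, -1.0, 2.0]
    for k in range(5):
        print(k, residual_span(v))
        v = value_iteration_step(v)
```

Running the script prints the span of the residual over a few iterations; in this sketch the printed values decrease by a factor of at most `GAMMA` per step.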
Pages: 343-351
Number of pages: 9