Approximate Dynamic Programming via a Smoothed Linear Program

Citations: 34
Authors
Desai, Vijay V. [1]
Farias, Vivek F. [2]
Moallemi, Ciamac C. [3]
Affiliations
[1] Columbia Univ, Dept Ind Engn & Operat Res, New York, NY 10027 USA
[2] MIT, Sloan Sch Management, Cambridge, MA 02139 USA
[3] Columbia Univ, Grad Sch Business, New York, NY 10027 USA
Keywords
CONVERGENCE; POLICIES
DOI
10.1287/opre.1120.1044
Chinese Library Classification
C93 [Management Science]
Subject classification codes
12; 1201; 1202; 120202
Abstract
We present a novel linear program for the approximation of the dynamic programming cost-to-go function in high-dimensional stochastic control problems. LP approaches to approximate DP have typically relied on a natural "projection" of a well-studied linear program for exact dynamic programming. Such programs restrict attention to approximations that are lower bounds to the optimal cost-to-go function. Our program, the "smoothed approximate linear program," is distinct from such approaches and relaxes the restriction to lower-bounding approximations in an appropriate fashion while remaining computationally tractable. Doing so appears to have several advantages. First, we demonstrate bounds on the quality of approximation to the optimal cost-to-go function afforded by our approach. These bounds are, in general, no worse than those available for extant LP approaches and for specific problem instances can be shown to be arbitrarily stronger. Second, experiments with our approach on a pair of challenging problems (the game of Tetris and a queueing network control problem) show that the approach outperforms the existing LP approach (which has previously been shown to be competitive with several ADP algorithms) by a substantial margin.
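The contrast the abstract draws between the existing LP approach and the smoothed program can be sketched in standard approximate-LP notation. The abstract defines none of the symbols below, so all of them are assumptions made for illustration: \Phi is a matrix of basis functions, r the weight vector, T the Bellman operator for a cost-minimization problem with discount factor \alpha, \nu a state-relevance weighting, \pi a weighting over constraint violations, s a vector of slack variables, and \theta \ge 0 a violation budget.

```latex
% Bellman operator (assumed notation): per-stage cost g, transition kernel P_a
\[
  (TJ)(x) \;=\; \min_{a}\Big[\, g(x,a) + \alpha \sum_{x'} P_a(x,x')\, J(x') \Big]
\]

% Approximate LP: the constraint forces \Phi r to be a lower bound on the
% optimal cost-to-go function
\[
  \max_{r}\ \nu^{\top}\Phi r
  \quad \text{s.t.} \quad \Phi r \,\le\, T\Phi r
\]

% Smoothed variant (sketch): slack s lets the approximation violate the
% lower-bound constraint, with total weighted violation capped by \theta
\[
  \max_{r,\,s}\ \nu^{\top}\Phi r
  \quad \text{s.t.} \quad \Phi r \,\le\, T\Phi r + s,
  \qquad \pi^{\top} s \,\le\, \theta,
  \qquad s \,\ge\, 0
\]
```

Under these assumptions, setting \theta = 0 with strictly positive \pi forces s = 0 and recovers the lower-bounding program, which is one way to read the abstract's claim that the smoothed bounds are no worse than those of the extant LP approach. Each constraint involving T expands into one linear inequality per state-action pair, so both programs remain linear.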
Pages: 655-674
Page count: 20