Approximate Dynamic Programming via a Smoothed Linear Program

Cited by: 34
Authors
Desai, Vijay V. [1]
Farias, Vivek F. [2]
Moallemi, Ciamac C. [3]
Affiliations
[1] Columbia Univ, Dept Ind Engn & Operat Res, New York, NY 10027 USA
[2] MIT, Sloan Sch Management, Cambridge, MA 02139 USA
[3] Columbia Univ, Grad Sch Business, New York, NY 10027 USA
Keywords
CONVERGENCE; POLICIES
DOI
10.1287/opre.1120.1044
Chinese Library Classification number
C93 [Management]
Discipline classification code
12; 1201; 1202; 120202
Abstract
We present a novel linear program for the approximation of the dynamic programming cost-to-go function in high-dimensional stochastic control problems. LP approaches to approximate DP have typically relied on a natural "projection" of a well-studied linear program for exact dynamic programming. Such programs restrict attention to approximations that are lower bounds to the optimal cost-to-go function. Our program, the "smoothed approximate linear program," is distinct from such approaches and relaxes the restriction to lower bounding approximations in an appropriate fashion while remaining computationally tractable. Doing so appears to have several advantages: First, we demonstrate bounds on the quality of approximation to the optimal cost-to-go function afforded by our approach. These bounds are, in general, no worse than those available for extant LP approaches and for specific problem instances can be shown to be arbitrarily stronger. Second, experiments with our approach on a pair of challenging problems (the game of Tetris and a queueing network control problem) show that the approach outperforms the existing LP approach (which has previously been shown to be competitive with several ADP algorithms) by a substantial margin.
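For orientation, the contrast drawn in the abstract can be sketched as two optimization problems. The notation is an assumption and does not appear in this record: $\Phi$ is a matrix of basis functions, $r$ a weight vector, $T$ the Bellman operator of a discounted cost-minimization problem, $\nu$ and $\pi$ distributions over states, $s$ a vector of slack variables, and $\theta \ge 0$ a violation budget. The first program keeps $\Phi r$ below the optimal cost-to-go function; the second lets the Bellman inequality be violated within a budget.

```latex
% Approximate LP: the constraint forces \Phi r to be a lower bound
% on the optimal cost-to-go function.
\begin{align*}
\max_{r}\quad & \nu^{\top} \Phi r \\
\text{s.t.}\quad & \Phi r \le T \Phi r
\end{align*}

% Smoothed variant (sketch): slack s relaxes the lower-bound restriction,
% and the weighted total violation is capped by \theta.
\begin{align*}
\max_{r,\,s}\quad & \nu^{\top} \Phi r \\
\text{s.t.}\quad & \Phi r \le T \Phi r + s, \\
& \pi^{\top} s \le \theta, \qquad s \ge 0
\end{align*}
```

Under these assumptions, taking $\theta = 0$ with $\pi$ strictly positive forces $s = 0$, so the second program reduces to the first.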
Pages: 655-674
Number of pages: 20