Efficient sampling in approximate dynamic programming algorithms

Cited by: 21
Authors
Cervellera, Cristiano [1]
Muselli, Marco [1]
Affiliation
[1] Ist Studi Sistemi Intelligenti Lautomaz, Consiglio Nazl Ric, I-16149 Genoa, Italy
Keywords
stochastic optimal control problem; dynamic programming; sample complexity; deterministic learning; low-discrepancy sequences;
DOI
10.1007/s10589-007-9054-8
Chinese Library Classification number
C93 [Management Science]; O22 [Operations Research];
Discipline classification code
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
Dynamic Programming (DP) is a standard optimization tool for solving Stochastic Optimal Control (SOC) problems over either a finite or an infinite horizon of stages. Under very general assumptions, commonly employed numerical algorithms approximate the cost-to-go functions by means of suitable parametric models built from a set of sampling points in the d-dimensional state space. This paper discusses the problem of sample complexity, i.e., how "fast" the number of points must grow with the input dimension in order to obtain an accurate estimate of the cost-to-go functions in typical DP approaches such as value iteration and policy iteration. It is shown that choosing the sampling points from low-discrepancy sequences, commonly used for efficient numerical integration, makes it possible to achieve, under suitable hypotheses, an almost linear sample complexity, thus helping to mitigate the curse of dimensionality of the approximate DP procedure.
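The contrast between low-discrepancy and random sampling mentioned in the abstract can be illustrated with a small sketch. It is not the authors' procedure: the abstract does not name a specific sequence, so the Halton sequence is used here as one well-known example, and the function names (`van_der_corput`, `halton`, `mean_abs_error`) and the test integrand are assumptions chosen for illustration only. The sketch estimates an integral over the unit cube, whose exact value is 1, from both kinds of sample sets.

```python
import math
import random

# First few primes, used as per-coordinate bases (illustrative limit on dimension).
PRIMES = (2, 3, 5, 7, 11, 13)


def van_der_corput(n, base):
    """Return the n-th term (n >= 1) of the van der Corput sequence in `base`."""
    value, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        value += digit / denom  # mirror the base-`base` digits about the radix point
    return value


def halton(num_points, dim):
    """Return the first `num_points` Halton points in [0, 1)^dim."""
    if dim > len(PRIMES):
        raise ValueError("this sketch supports at most %d dimensions" % len(PRIMES))
    return [
        [van_der_corput(i, PRIMES[k]) for k in range(dim)]
        for i in range(1, num_points + 1)
    ]


def integrand(x):
    """Hypothetical test function: product of 2*x_k, with exact integral 1."""
    return math.prod(2.0 * xk for xk in x)


def mean_abs_error(points):
    """Absolute error of the sample-mean estimate of the integral."""
    estimate = sum(integrand(p) for p in points) / len(points)
    return abs(estimate - 1.0)


if __name__ == "__main__":
    dim, n = 3, 1024
    rng = random.Random(0)  # fixed seed so the random baseline is repeatable
    uniform = [[rng.random() for _ in range(dim)] for _ in range(n)]
    print("Halton error:", mean_abs_error(halton(n, dim)))
    print("random error:", mean_abs_error(uniform))
```

In an approximate DP setting the same point sets would serve as the state-space samples at which the cost-to-go function is evaluated before fitting a parametric model; the integration error above is only a proxy for how evenly the points cover the space.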
Pages: 417-443
Number of pages: 27