Efficient sampling in approximate dynamic programming algorithms

Cited by: 21
Authors
Cervellera, Cristiano [1]
Muselli, Marco [1]
Affiliations
[1] Ist Studi Sistemi Intelligenti Lautomaz, Consiglio Nazl Ric, I-16149 Genoa, Italy
Keywords
stochastic optimal control problem; dynamic programming; sample complexity; deterministic learning; low-discrepancy sequences
DOI
10.1007/s10589-007-9054-8
Chinese Library Classification number
C93 [Management Science]; O22 [Operations Research]
Subject classification code
070105; 12; 1201; 1202; 120202
Abstract
Dynamic Programming (DP) is a standard optimization tool for solving Stochastic Optimal Control (SOC) problems over either a finite or an infinite horizon of stages. Under very general assumptions, commonly employed numerical algorithms are based on approximations of the cost-to-go functions by means of suitable parametric models built from a set of sampling points in the d-dimensional state space. Here the problem of sample complexity is discussed, i.e., how "fast" the number of points must grow with the input dimension in order to obtain an accurate estimate of the cost-to-go functions in typical DP approaches such as value iteration and policy iteration. It is shown that choosing the sampling points from low-discrepancy sequences, commonly used for efficient numerical integration, makes it possible to achieve, under suitable hypotheses, an almost linear sample complexity, thus helping to mitigate the curse of dimensionality of the approximate DP procedure.
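As a rough illustration of what sampling a d-dimensional state space with a low-discrepancy sequence can look like, the sketch below generates Halton points in the unit cube. The Halton sequence is only one example of a low-discrepancy sequence and is an assumption made here; the abstract does not say which sequences the paper uses. The names `radical_inverse`, `halton`, and `mean_product` are hypothetical helpers, and the integration check merely stands in for the averaging step of an approximate DP scheme.

```python
# Minimal sketch, not the authors' implementation: Halton points in [0, 1)^d.

PRIMES = (2, 3, 5, 7, 11, 13)  # one prime base per dimension (assumed cap: d <= 6)


def radical_inverse(index: int, base: int) -> float:
    """Van der Corput radical inverse: mirror the base-`base` digits of
    `index` around the radix point."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n_points: int, dim: int) -> list[list[float]]:
    """Return the first `n_points` Halton points in `dim` dimensions,
    starting from index 1 so the origin is skipped."""
    if not 1 <= dim <= len(PRIMES):
        raise ValueError("dim must be between 1 and %d" % len(PRIMES))
    return [
        [radical_inverse(i, PRIMES[k]) for k in range(dim)]
        for i in range(1, n_points + 1)
    ]


def mean_product(points: list[list[float]]) -> float:
    """Average of f(x) = x_1 * ... * x_d over the points; the exact
    integral of f over the unit cube is 2 ** (-d)."""
    total = 0.0
    for point in points:
        value = 1.0
        for coordinate in point:
            value *= coordinate
        total += value
    return total / len(points)


if __name__ == "__main__":
    sample = halton(1024, 2)
    print("estimate:", mean_product(sample), "exact:", 0.25)
```

In an approximate DP setting, points like these would be rescaled to the bounds of the state space and used as the locations where the cost-to-go function is evaluated and fitted; that step depends on the problem and is not shown.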
Pages: 417 - 443
Page count: 27
Related papers
50 in total
  • [1] Efficient sampling in approximate dynamic programming algorithms
    Cristiano Cervellera
    Marco Muselli
    Computational Optimization and Applications, 2007, 38 : 417 - 443
  • [2] F-Discrepancy for Efficient Sampling in Approximate Dynamic Programming
    Cervellera, Cristiano
    Maccio, Danilo
    IEEE TRANSACTIONS ON CYBERNETICS, 2016, 46 (07) : 1628 - 1639
  • [3] On constraint sampling in the linear programming approach to approximate dynamic programming
    de Farias, DP
    Van Roy, B
    MATHEMATICS OF OPERATIONS RESEARCH, 2004, 29 (03) : 462 - 478
  • [4] Quasi-Random Sampling for Approximate Dynamic Programming
    Cervellera, Cristiano
    Gaggero, Mauro
    Maccio, Danilo
    Marcialis, Roberto
    2013 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2013,
  • [5] Efficient Sampling Algorithms for Approximate Temporal Motif Counting
    Wang, Jingjing
    Wang, Yanhao
    Jiang, Wenjun
    Li, Yuchen
    Tan, Kian-Lee
    CIKM '20: PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, 2020, : 1505 - 1514
  • [6] Lattice point sets for state sampling in approximate dynamic programming
    Cervellera, Cristiano
    Gaggero, Mauro
    Maccio, Danilo
    OPTIMAL CONTROL APPLICATIONS & METHODS, 2017, 38 (06) : 1193 - 1207
  • [7] ON THE CONVERGENCE OF SAMPLING ALGORITHMS FOR SOLVING DYNAMIC STOCHASTIC PROGRAMMING
    CHEN Zhiping (Faculty of Science
    Systems Science and Mathematical Sciences, 2000, (04) : 397 - 406
  • [8] Sequential Importance Sampling Algorithms for Dynamic Stochastic Programming
    M. A. H. Dempster
    Journal of Mathematical Sciences, 2006, 133 (4) : 1422 - 1444
  • [9] Efficient Approximate Algorithms for Empirical Variance with Hashed Block Sampling
    Chen, Xingguang
    Zhang, Fangyuan
    Wang, Sibo
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 157 - 167
  • [10] Low-discrepancy sampling for approximate dynamic programming with local approximators
    Cervellera, C.
    Gaggero, M.
    Maccio, D.
    COMPUTERS & OPERATIONS RESEARCH, 2014, 43 : 108 - 115