Approximate dynamic programming with a fuzzy parameterization

Cited by: 48
Authors
Busoniu, Lucian [1]
Ernst, Damien [2]
De Schutter, Bart [1]
Babuska, Robert [1]
Affiliations
[1] Delft Univ Technol, Delft Ctr Syst & Control, NL-2628 CD Delft, Netherlands
[2] Univ Liege, Inst Montefiore, FNRS, B-4000 Liege, Belgium
Keywords
Approximate dynamic programming; Fuzzy approximation; Value iteration; Convergence analysis; ALGORITHM;
DOI
10.1016/j.automatica.2010.02.006
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Dynamic programming (DP) is a powerful paradigm for general, nonlinear optimal control. Computing exact DP solutions is in general only possible when the process states and the control actions take values in a small discrete set. In practice, it is necessary to approximate the solutions. Therefore, we propose an algorithm for approximate DP that relies on a fuzzy partition of the state space, and on a discretization of the action space. This fuzzy Q-iteration algorithm works for deterministic processes, under the discounted return criterion. We prove that fuzzy Q-iteration asymptotically converges to a solution that lies within a bound of the optimal solution. A bound on the suboptimality of the solution obtained in a finite number of iterations is also derived. Under continuity assumptions on the dynamics and on the reward function, we show that fuzzy Q-iteration is consistent, i.e., that it asymptotically obtains the optimal solution as the approximation accuracy increases. These properties hold both when the parameters of the approximator are updated in a synchronous fashion, and when they are updated asynchronously. The asynchronous algorithm is proven to converge at least as fast as the synchronous one. The performance of fuzzy Q-iteration is illustrated in a two-link manipulator control problem. (C) 2010 Elsevier Ltd. All rights reserved.
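The abstract describes the algorithm only in words, so the following is a minimal sketch of a synchronous fuzzy Q-iteration-style update, not the authors' implementation. It assumes triangular membership functions on a one-dimensional grid, a toy deterministic system, and a quadratic reward; the names `triangular_memberships`, `fuzzy_q_iteration`, `greedy_action`, `f`, and `rho` are all hypothetical and chosen for illustration. Each parameter `theta[i, j]` is associated with the core of fuzzy set `i` and discrete action `j`.

```python
import numpy as np

def triangular_memberships(x, cores):
    """Membership degrees of scalar x in triangular fuzzy sets centred at `cores`.

    Assumes `cores` is sorted and strictly increasing; degrees sum to 1.
    """
    phi = np.zeros(len(cores))
    x = float(np.clip(x, cores[0], cores[-1]))
    k = int(np.searchsorted(cores, x, side="right")) - 1
    k = min(max(k, 0), len(cores) - 2)
    w = (x - cores[k]) / (cores[k + 1] - cores[k])
    phi[k] = 1.0 - w
    phi[k + 1] = w
    return phi

def fuzzy_q_iteration(f, rho, cores, actions, gamma=0.95, tol=1e-8, max_iter=10_000):
    """Synchronous update of theta[i, j] for a deterministic system x' = f(x, u)."""
    # Memberships of every successor state f(x_i, u_j): shape (n, m, n).
    next_phi = np.array([[triangular_memberships(f(x, u), cores) for u in actions]
                         for x in cores])
    # Immediate rewards rho(x_i, u_j): shape (n, m).
    rewards = np.array([[rho(x, u) for u in actions] for x in cores])
    theta = np.zeros((len(cores), len(actions)))
    for _ in range(max_iter):
        # Approximate Q-values at successor states, maximised over actions.
        next_values = (next_phi @ theta).max(axis=-1)
        new_theta = rewards + gamma * next_values
        if np.max(np.abs(new_theta - theta)) < tol:
            return new_theta
        theta = new_theta
    return theta

def greedy_action(x, theta, cores, actions):
    """Action maximising the interpolated Q-value at state x."""
    q = triangular_memberships(x, cores) @ theta
    return actions[int(np.argmax(q))]

# Toy problem (an assumption, not the paper's manipulator example):
# drive x in [-1, 1] towards 0 with actions {-1, 0, 1}.
cores = np.linspace(-1.0, 1.0, 21)
actions = np.array([-1.0, 0.0, 1.0])
f = lambda x, u: float(np.clip(x + 0.1 * u, -1.0, 1.0))
rho = lambda x, u: -x ** 2

theta = fuzzy_q_iteration(f, rho, cores, actions)
print(greedy_action(0.5, theta, cores, actions))
```

Because the membership degrees are non-negative and sum to one, the update is a contraction with factor `gamma` in the max norm, which is why the loop can stop on a small parameter change. The asynchronous variant mentioned in the abstract would instead overwrite `theta` entry by entry within each sweep.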
Pages: 804 - 814
Number of pages: 11
Related papers
50 in total
  • [11] Approximate dynamic programming for stochastic reachability
    Kariotoglou, Nikolaos
    Summers, Sean
    Summers, Tyler
    Kamgarpour, Maryam
    Lygeros, John
    2013 EUROPEAN CONTROL CONFERENCE (ECC), 2013, : 584 - 589
  • [12] Approximate dynamic programming for container stacking
    Boschma, Rene
    Mes, Martijn R. K.
    de Vries, Leon R.
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2023, 310 (01) : 328 - 342
  • [13] Feature Discovery in Approximate Dynamic Programming
    Preux, Philippe
    Girgin, Sertan
    Loth, Manuel
    ADPRL: 2009 IEEE SYMPOSIUM ON ADAPTIVE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING, 2009, : 109 - +
  • [14] Bayesian Exploration for Approximate Dynamic Programming
    Ryzhov, Ilya O.
    Mes, Martijn R. K.
    Powell, Warren B.
    van den Berg, Gerald
    OPERATIONS RESEARCH, 2019, 67 (01) : 198 - 214
  • [15] On approximate dynamic programming in switching systems
    Rantzer, Anders
    2005 44TH IEEE CONFERENCE ON DECISION AND CONTROL & EUROPEAN CONTROL CONFERENCE, VOLS 1-8, 2005, : 1391 - 1396
  • [16] Approximate Dynamic Programming for Ambulance Redeployment
    Maxwell, Matthew S.
    Restrepo, Mateo
    Henderson, Shane G.
    Topaloglu, Huseyin
    INFORMS JOURNAL ON COMPUTING, 2010, 22 (02) : 266 - 281
  • [17] An Approximate Dynamic Programming Approach to Dynamic Stochastic Matching
    You, Fan
    Vossen, Thomas
    INFORMS JOURNAL ON COMPUTING, 2024, 36 (04) : 1006 - 1022
  • [18] Decentralized approximate dynamic programming for dynamic networks of agents
    Lakshmanan, Hariharan
    Pucci de Farias, Daniela
    2006 AMERICAN CONTROL CONFERENCE, VOLS 1-12, 2006, 1-12 : 1648 - +
  • [19] The fuzzy dynamic programming problems
    Nguyen Dinh Phu
    Phan Van Tri
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2016, 30 (03) : 1663 - 1674
  • [20] On constraint sampling in the linear programming approach to approximate dynamic programming
    de Farias, DP
    Van Roy, B
    MATHEMATICS OF OPERATIONS RESEARCH, 2004, 29 (03) : 462 - 478