Approximate dynamic programming with a fuzzy parameterization

Cited by: 48
Authors
Busoniu, Lucian [1]
Ernst, Damien [2]
De Schutter, Bart [1]
Babuska, Robert [1]
Affiliations
[1] Delft Univ Technol, Delft Ctr Syst & Control, NL-2628 CD Delft, Netherlands
[2] Univ Liege, Inst Montefiore, FNRS, B-4000 Liege, Belgium
Keywords
Approximate dynamic programming; Fuzzy approximation; Value iteration; Convergence analysis; Algorithm
DOI
10.1016/j.automatica.2010.02.006
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Dynamic programming (DP) is a powerful paradigm for general, nonlinear optimal control. Computing exact DP solutions is in general only possible when the process states and the control actions take values in a small discrete set. In practice, it is necessary to approximate the solutions. Therefore, we propose an algorithm for approximate DP that relies on a fuzzy partition of the state space, and on a discretization of the action space. This fuzzy Q-iteration algorithm works for deterministic processes, under the discounted return criterion. We prove that fuzzy Q-iteration asymptotically converges to a solution that lies within a bound of the optimal solution. A bound on the suboptimality of the solution obtained in a finite number of iterations is also derived. Under continuity assumptions on the dynamics and on the reward function, we show that fuzzy Q-iteration is consistent, i.e., that it asymptotically obtains the optimal solution as the approximation accuracy increases. These properties hold both when the parameters of the approximator are updated in a synchronous fashion, and when they are updated asynchronously. The asynchronous algorithm is proven to converge at least as fast as the synchronous one. The performance of fuzzy Q-iteration is illustrated in a two-link manipulator control problem. (C) 2010 Elsevier Ltd. All rights reserved.
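The abstract describes the ingredients of fuzzy Q-iteration (a fuzzy partition of the state space, a discretized action set, deterministic dynamics, discounted return) but does not state the update itself. The following is a minimal sketch of how a synchronous update of this kind can look, assuming triangular membership functions on a uniform grid and one parameter per (fuzzy set, discrete action) pair. The toy one-dimensional system, the reward, and every function and variable name are illustrative assumptions and are not taken from the paper.

```python
def triangular_memberships(x, centers):
    """Membership degrees of x in triangular fuzzy sets whose cores are
    the sorted grid points `centers`. The degrees sum to 1."""
    phi = [0.0] * len(centers)
    if x <= centers[0]:
        phi[0] = 1.0
        return phi
    if x >= centers[-1]:
        phi[-1] = 1.0
        return phi
    for i in range(len(centers) - 1):
        lo, hi = centers[i], centers[i + 1]
        if lo <= x <= hi:
            w = (x - lo) / (hi - lo)
            phi[i] = 1.0 - w
            phi[i + 1] = w
            break
    return phi


def approx_q(x, j, theta, centers):
    """Approximate Q-value: membership-weighted sum of the parameters."""
    phi = triangular_memberships(x, centers)
    return sum(p * theta[i][j] for i, p in enumerate(phi))


def fuzzy_q_iteration(f, rho, centers, actions, gamma, tol=1e-8, max_iter=10000):
    """Synchronous sweep: every parameter is recomputed from the previous
    parameter table until the largest change falls below `tol`."""
    n_actions = len(actions)
    theta = [[0.0] * n_actions for _ in centers]
    for _ in range(max_iter):
        new_theta = [[0.0] * n_actions for _ in centers]
        for i, x_i in enumerate(centers):
            for j, u_j in enumerate(actions):
                x_next = f(x_i, u_j)  # deterministic transition
                best = max(approx_q(x_next, jp, theta, centers)
                           for jp in range(n_actions))
                new_theta[i][j] = rho(x_i, u_j) + gamma * best
        change = max(abs(new_theta[i][j] - theta[i][j])
                     for i in range(len(centers)) for j in range(n_actions))
        theta = new_theta
        if change < tol:
            break
    return theta


def greedy_action(x, theta, centers, actions):
    """Discrete action that maximizes the approximate Q-value at state x."""
    j_best = max(range(len(actions)),
                 key=lambda j: approx_q(x, j, theta, centers))
    return actions[j_best]


# Hypothetical toy problem: drive x in [-1, 1] toward the origin.
def step(x, u):
    return max(-1.0, min(1.0, x + u))


def reward(x, u):
    return -x * x


centers = [-1.0 + 0.1 * i for i in range(21)]
actions = [-0.1, 0.0, 0.1]
theta = fuzzy_q_iteration(step, reward, centers, actions, gamma=0.95)
print(greedy_action(0.5, theta, centers, actions))
```

An asynchronous variant, which the abstract also covers, would write each new parameter back into `theta` immediately instead of into a separate table, so later updates in the same sweep already use it.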
Pages: 804-814
Number of pages: 11