Approximate dynamic programming with a fuzzy parameterization

Cited by: 48
Authors
Busoniu, Lucian [1]
Ernst, Damien [2]
De Schutter, Bart [1]
Babuska, Robert [1]
Affiliations
[1] Delft Univ Technol, Delft Ctr Syst & Control, NL-2628 CD Delft, Netherlands
[2] Univ Liege, Inst Montefiore, FNRS, B-4000 Liege, Belgium
Keywords
Approximate dynamic programming; Fuzzy approximation; Value iteration; Convergence analysis; ALGORITHM;
DOI
10.1016/j.automatica.2010.02.006
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Dynamic programming (DP) is a powerful paradigm for general, nonlinear optimal control. Computing exact DP solutions is in general only possible when the process states and the control actions take values in a small discrete set. In practice, it is necessary to approximate the solutions. Therefore, we propose an algorithm for approximate DP that relies on a fuzzy partition of the state space, and on a discretization of the action space. This fuzzy Q-iteration algorithm works for deterministic processes, under the discounted return criterion. We prove that fuzzy Q-iteration asymptotically converges to a solution that lies within a bound of the optimal solution. A bound on the suboptimality of the solution obtained in a finite number of iterations is also derived. Under continuity assumptions on the dynamics and on the reward function, we show that fuzzy Q-iteration is consistent, i.e., that it asymptotically obtains the optimal solution as the approximation accuracy increases. These properties hold both when the parameters of the approximator are updated in a synchronous fashion, and when they are updated asynchronously. The asynchronous algorithm is proven to converge at least as fast as the synchronous one. The performance of fuzzy Q-iteration is illustrated in a two-link manipulator control problem. (C) 2010 Elsevier Ltd. All rights reserved.
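The abstract describes the algorithm only in words, so the following is a minimal sketch of a synchronous fuzzy Q-iteration rather than the authors' implementation. It assumes triangular membership functions on an equidistant grid (one possible fuzzy partition), a discretized action set, and a parameter update of the form theta[i, j] <- rho(x_i, u_j) + gamma * max_j' sum_i' phi_i'(f(x_i, u_j)) * theta[i', j'], which is how the method is commonly stated. The one-dimensional dynamics `f`, the reward `rho`, the grid `centers`, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def memberships(x, centers):
    """Triangular membership degrees of x; nonnegative and summing to 1."""
    phi = np.zeros(len(centers))
    x = float(np.clip(x, centers[0], centers[-1]))
    # Index of the grid cell [centers[k], centers[k + 1]] that contains x.
    k = min(int(np.searchsorted(centers, x, side="right")) - 1, len(centers) - 2)
    w = (x - centers[k]) / (centers[k + 1] - centers[k])
    phi[k], phi[k + 1] = 1.0 - w, w
    return phi

def fuzzy_q_iteration(f, rho, centers, actions, gamma, tol=1e-8, max_iter=10_000):
    """Synchronous update of the parameter matrix theta (fuzzy sets x actions)."""
    n, m = len(centers), len(actions)
    next_phi = np.zeros((n, m, n))  # memberships of each successor state
    reward = np.zeros((n, m))
    for i, x in enumerate(centers):
        for j, u in enumerate(actions):
            next_phi[i, j] = memberships(f(x, u), centers)
            reward[i, j] = rho(x, u)
    theta = np.zeros((n, m))
    for _ in range(max_iter):
        # q_next[i, j, j2] approximates Q(f(x_i, u_j), u_j2).
        q_next = next_phi @ theta
        new_theta = reward + gamma * q_next.max(axis=2)
        if np.max(np.abs(new_theta - theta)) < tol:
            return new_theta
        theta = new_theta
    return theta

def greedy_action(x, theta, centers, actions):
    """Discrete action maximizing the approximate Q-function at state x."""
    q = memberships(x, centers) @ theta
    return actions[int(np.argmax(q))]

# Hypothetical toy problem: drive a scalar state to the origin.
centers = np.linspace(-2.0, 2.0, 9)
actions = np.array([-0.5, 0.0, 0.5])

def f(x, u):
    """Deterministic dynamics, saturated to the state domain."""
    return float(np.clip(x + u, -2.0, 2.0))

def rho(x, u):
    """Reward penalizing distance from the origin."""
    return -x ** 2

theta = fuzzy_q_iteration(f, rho, centers, actions, gamma=0.9)
print(greedy_action(1.0, theta, centers, actions))
```

Because the discount factor is below 1 and the membership degrees form a convex combination, this update is a contraction in the infinity norm, which is consistent with the convergence claim in the abstract. The asynchronous variant mentioned there would reuse freshly updated entries of `theta` within the same sweep instead of computing `new_theta` from the old matrix.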
Pages: 804-814
Number of pages: 11