Approximate dynamic programming with a fuzzy parameterization

Cited by: 48

Authors:

Busoniu, Lucian [1]

Ernst, Damien [2]

De Schutter, Bart [1]

Babuska, Robert [1]

Affiliations:

[1] Delft Univ Technol, Delft Ctr Syst & Control, NL-2628 CD Delft, Netherlands

[2] Univ Liege, Inst Montefiore, FNRS, B-4000 Liege, Belgium

Source:

AUTOMATICA | 2010 / Vol. 46 / Issue 05

Keywords:

Approximate dynamic programming; Fuzzy approximation; Value iteration; Convergence analysis; ALGORITHM;

DOI:

10.1016/j.automatica.2010.02.006

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812 ;

Abstract:

Dynamic programming (DP) is a powerful paradigm for general, nonlinear optimal control. Computing exact DP solutions is in general only possible when the process states and the control actions take values in a small discrete set. In practice, it is necessary to approximate the solutions. Therefore, we propose an algorithm for approximate DP that relies on a fuzzy partition of the state space, and on a discretization of the action space. This fuzzy Q-iteration algorithm works for deterministic processes, under the discounted return criterion. We prove that fuzzy Q-iteration asymptotically converges to a solution that lies within a bound of the optimal solution. A bound on the suboptimality of the solution obtained in a finite number of iterations is also derived. Under continuity assumptions on the dynamics and on the reward function, we show that fuzzy Q-iteration is consistent, i.e., that it asymptotically obtains the optimal solution as the approximation accuracy increases. These properties hold both when the parameters of the approximator are updated in a synchronous fashion, and when they are updated asynchronously. The asynchronous algorithm is proven to converge at least as fast as the synchronous one. The performance of fuzzy Q-iteration is illustrated in a two-link manipulator control problem. (C) 2010 Elsevier Ltd. All rights reserved.
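
The abstract names the ingredients of fuzzy Q-iteration (a fuzzy partition of the state space, a discrete action set, deterministic dynamics, discounted return) but does not state the update rule. The following is a minimal sketch of one way the synchronous variant can be written, not the authors' implementation. Everything specific in it is an assumption made for illustration: the triangular membership functions, the one-dimensional state, the toy dynamics `f`, the toy reward `rho`, and all function and variable names.

```python
import numpy as np

def triangular_memberships(x, centers):
    """Membership degrees of scalar x in triangular fuzzy sets with the given
    sorted cores. The degrees are non-negative and sum to 1."""
    x = float(np.clip(x, centers[0], centers[-1]))
    phi = np.zeros(len(centers))
    k = int(np.searchsorted(centers, x, side="right")) - 1
    if k >= len(centers) - 1:          # x sits on the last core
        phi[-1] = 1.0
        return phi
    w = (x - centers[k]) / (centers[k + 1] - centers[k])
    phi[k], phi[k + 1] = 1.0 - w, w    # linear interpolation weights
    return phi

def fuzzy_q_iteration(f, rho, centers, actions, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Synchronous sketch: one parameter theta[i, j] per (fuzzy core, action)."""
    # Rewards and next-state memberships only need evaluating at the cores.
    rewards = np.array([[rho(x, u) for u in actions] for x in centers])
    next_phi = np.array([[triangular_memberships(f(x, u), centers)
                          for u in actions] for x in centers])   # shape (n, m, n)
    theta = np.zeros((len(centers), len(actions)))
    for _ in range(max_iter):
        q_next = next_phi @ theta                  # shape (n, m, m)
        new_theta = rewards + gamma * q_next.max(axis=2)
        if np.max(np.abs(new_theta - theta)) < tol:
            return new_theta
        theta = new_theta
    return theta

def greedy_action(theta, x, centers, actions):
    """Action maximizing the interpolated Q-value at state x."""
    q = triangular_memberships(x, centers) @ theta
    return actions[int(np.argmax(q))]

# Hypothetical toy problem: drive a scalar state toward 0 on [-1, 1].
centers = np.linspace(-1.0, 1.0, 21)
actions = np.array([-1.0, 0.0, 1.0])
f = lambda x, u: float(np.clip(x + 0.1 * u, -1.0, 1.0))   # assumed dynamics
rho = lambda x, u: -x ** 2                                # assumed reward
theta = fuzzy_q_iteration(f, rho, centers, actions)
```

Because the membership degrees are a convex combination and the discount factor is below 1, this update is a contraction in the parameters, which is consistent with the convergence claim in the abstract. An asynchronous variant would write each `theta[i, j]` in place as soon as it is computed instead of building `new_theta`.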


Pages: 804-814

Number of pages: 11

50 records in total

[31] Intelligent Questionnaires Using Approximate Dynamic Programming
Logé F.
Le Pennec E.
Amadou-Boubacar H.
i-com, 2021, 19 (03) : 227 - 237
[32] Approximate Dynamic Programming via Penalty Functions
Beuchat, Paul N.
Lygeros, John
IFAC PAPERSONLINE, 2017, 50 (01) : 11814 - 11821
[33] An approximate dynamic programming approach for collaborative caching
Yang, Xinan
Thomos, Nikolaos
ENGINEERING OPTIMIZATION, 2021, 53 (06) : 1005 - 1023
[34] ON CONVERGENCE OF APPROXIMATE SOLUTIONS OF A DYNAMIC PROGRAMMING EQUATION
JANKOWSK.T
COLLOQUIUM MATHEMATICUM, 1970, 21 (01) : 149 - &
[35] Approximate dynamic programming for ship course control
Bai, Xuerui
Yi, Jianqiang
Zhao, Dongbin
ADVANCES IN NEURAL NETWORKS - ISNN 2007, PT 1, PROCEEDINGS, 2007, 4491 : 349 - +
[36] AMBULANCE REDEPLOYMENT: AN APPROXIMATE DYNAMIC PROGRAMMING APPROACH
Maxwell, Matthew S.
Henderson, Shane G.
Topaloglu, Huseyin
PROCEEDINGS OF THE 2009 WINTER SIMULATION CONFERENCE (WSC 2009), VOL 1-4, 2009, : 1801 - 1811
[37] Empirical Value Iteration for Approximate Dynamic Programming
Haskell, William B.
Jain, Rahul
Kalathil, Dileep
2014 AMERICAN CONTROL CONFERENCE (ACC), 2014, : 495 - 500
[38] Efficient sampling in approximate dynamic programming algorithms
Cervellera, Cristiano
Muselli, Marco
COMPUTATIONAL OPTIMIZATION AND APPLICATIONS, 2007, 38 (03) : 417 - 443
[39] Inpatient Overflow: An Approximate Dynamic Programming Approach
Dai, J. G.
Shi, Pengyi
M&SOM-MANUFACTURING & SERVICE OPERATIONS MANAGEMENT, 2019, 21 (04) : 894 - 911
[40] APPROXIMATE DYNAMIC PROGRAMMING: LESSONS FROM THE FIELD
Powell, Warren B.
2008 WINTER SIMULATION CONFERENCE, VOLS 1-5, 2008, : 205 - 214
