Approximate dynamic programming with Gaussian processes

Cited by: 19

Authors:

Deisenroth, Marc P. [1,2]

Peters, Jan [2]

Rasmussen, Carl E. [1,2]

Affiliations:

[1] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England

[2] Max Planck Inst Biol Cybernet, Tübingen, Germany

Source:

2008 AMERICAN CONTROL CONFERENCE, VOLS 1-12 | 2008

DOI:

10.1109/ACC.2008.4587201

Chinese Library Classification:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

In general, it is difficult to determine an optimal closed-loop policy in nonlinear control problems with continuous-valued state and control domains. Hence, approximations are often inevitable. The standard method of discretizing states and controls suffers from the curse of dimensionality and strongly depends on the chosen temporal sampling rate. In this paper, we introduce Gaussian process dynamic programming (GPDP) and determine an approximate globally optimal closed-loop policy. In GPDP, value functions in the Bellman recursion of the dynamic programming algorithm are modeled using Gaussian processes. GPDP returns an optimal state feedback for a finite set of states. Based on these outcomes, we learn a possibly discontinuous closed-loop policy on the entire state space by switching between two independently trained Gaussian processes. A binary classifier selects one Gaussian process to predict the optimal control signal. We show that GPDP is able to yield an almost optimal solution to an LQ problem using few sample points. Moreover, we successfully apply GPDP to the underpowered pendulum swing-up, a complex nonlinear control problem.
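
To make the Bellman recursion described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a scalar, deterministic linear system with quadratic cost (a toy LQ problem), a squared-exponential kernel with fixed hyperparameters, a small grid of support states, and a discretized control set; all names (`se_kernel`, `gp_fit`, `gp_mean`, `gpdp_sketch`) and all numeric settings are hypothetical. Only the value function is modeled by a Gaussian process here, and the policy-learning stage with two Gaussian processes and a binary classifier is omitted.

```python
import numpy as np

def se_kernel(x1, x2, ell=0.5, sf2=1.0):
    # Squared-exponential kernel; hyperparameters are fixed by assumption.
    d = x1[:, None] - x2[None, :]
    return sf2 * np.exp(-0.5 * (d / ell) ** 2)

def gp_fit(X, y, noise=1e-4):
    # Weights alpha = (K + noise * I)^{-1} y of a zero-mean GP posterior mean.
    K = se_kernel(X, X) + noise * np.eye(len(X))
    return np.linalg.solve(K, y)

def gp_mean(X, alpha, Xq):
    # GP posterior mean evaluated at the query points Xq.
    return se_kernel(Xq, X) @ alpha

def gpdp_sketch(a=1.0, b=1.0, q=1.0, r=1.0, horizon=5):
    # Toy problem (assumed): x' = a*x + b*u, stage cost q*x^2 + r*u^2.
    X = np.linspace(-2.0, 2.0, 21)   # support states
    U = np.linspace(-2.0, 2.0, 41)   # candidate controls
    V = q * X**2                     # terminal cost at the support states
    policy = np.zeros_like(X)
    for _ in range(horizon):
        # Model the next-stage value function with a GP.
        alpha = gp_fit(X, V)
        # Successor states for every (state, control) pair; clipped to the
        # support range because a zero-mean GP extrapolates toward zero.
        X_next = np.clip(a * X[:, None] + b * U[None, :], X[0], X[-1])
        V_next = gp_mean(X, alpha, X_next.ravel()).reshape(X_next.shape)
        # Bellman backup: stage cost plus predicted cost-to-go.
        Q = q * X[:, None] ** 2 + r * U[None, :] ** 2 + V_next
        best = np.argmin(Q, axis=1)
        V = Q[np.arange(len(X)), best]
        policy = U[best]
    return X, V, policy

X, V, policy = gpdp_sketch()
```

The returned `policy` holds one control per support state, which corresponds to the "optimal state feedback for a finite set of states" mentioned in the abstract. Clipping successor states is a design choice of this sketch only; it keeps the optimizer from exploiting the GP's reversion to zero outside the support range.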

Citation

Pages: 4480 / +

Page count: 2

50 items in total

[1] Approximate Dynamic Programming with Gaussian Processes for Optimal Control of Continuous-Time Nonlinear Systems
Beppu, Hirofumi
Maruta, Ichiro
Fujimoto, Kenji
IFAC PAPERSONLINE, 2020, 53 (02) : 6715 - 6722
[2] Practical Hilbert space approximate Bayesian Gaussian processes for probabilistic programming
Gabriel Riutort-Mayol
Paul-Christian Bürkner
Michael R. Andersen
Arno Solin
Aki Vehtari
Statistics and Computing, 2023, 33
[3] Practical Hilbert space approximate Bayesian Gaussian processes for probabilistic programming
Riutort-Mayol, Gabriel
Buerkner, Paul-Christian
Andersen, Michael R.
Solin, Arno
Vehtari, Aki
STATISTICS AND COMPUTING, 2023, 33 (01)
[4] Approximate Dynamic Programming Using Bellman Residual Elimination and Gaussian Process Regression
Bethke, Brett
How, Jonathan P.
2009 AMERICAN CONTROL CONFERENCE, VOLS 1-9, 2009, : 745 - +
[5] DYNAMIC PROGRAMMING PROCESSES WITHIN DYNAMIC PROGRAMMING PROCESSES
ROSE, CJ
JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 1969, 26 (03) : 669 - &
[6] Data-Driven Differential Dynamic Programming Using Gaussian Processes
Pan, Yunpeng
Theodorou, Evangelos A.
2015 AMERICAN CONTROL CONFERENCE (ACC), 2015, : 4467 - 4472
[7] Approximate Dynamic Programming Based on Gaussian Process Regression for the Perimeter Patrol Optimization Problem
Qi, Naiming
Sun, Xiaolei
Sun, Kang
Liu, Xingfu
Wu, Feng
Liu, Chao
2014 INTERNATIONAL CONFERENCE ON MECHATRONICS AND CONTROL (ICMC), 2014, : 1750 - 1754
[8] Perspectives of approximate dynamic programming
Powell, Warren B.
ANNALS OF OPERATIONS RESEARCH, 2016, 241 (1-2) : 319 - 356
[9] A Survey of Approximate Dynamic Programming
Wang Lin
Peng Hui
Zhu Hua-yong
Shen Lin-cheng
2009 INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS, VOL 2, PROCEEDINGS, 2009, : 396 - 399
[10] A LINEAR PROGRAMMING METHODOLOGY FOR APPROXIMATE DYNAMIC PROGRAMMING
Diaz, Henry
Sala, Antonio
Armesto, Leopoldo
INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND COMPUTER SCIENCE, 2020, 30 (02) : 363 - 375