Approximate dynamic programming with Gaussian processes

Cited by: 19
Authors
Deisenroth, Marc P. [1, 2]
Peters, Jan [2]
Rasmussen, Carl E. [1, 2]
Affiliations
[1] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
[2] Max Planck Inst Biol Cybernet, Tubingen, Germany
DOI
10.1109/ACC.2008.4587201
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In general, it is difficult to determine an optimal closed-loop policy in nonlinear control problems with continuous-valued state and control domains. Hence, approximations are often inevitable. The standard method of discretizing states and controls suffers from the curse of dimensionality and strongly depends on the chosen temporal sampling rate. In this paper, we introduce Gaussian process dynamic programming (GPDP) and determine an approximate globally optimal closed-loop policy. In GPDP, value functions in the Bellman recursion of the dynamic programming algorithm are modeled using Gaussian processes. GPDP returns an optimal state-feedback for a finite set of states. Based on these outcomes, we learn a possibly discontinuous closed-loop policy on the entire state space by switching between two independently trained Gaussian processes. A binary classifier selects one Gaussian process to predict the optimal control signal. We show that GPDP is able to yield an almost optimal solution to an LQ problem using few sample points. Moreover, we successfully apply GPDP to the underpowered pendulum swing up, a complex nonlinear control problem.
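The Bellman recursion with a Gaussian-process value model described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a scalar, deterministic linear system with quadratic cost, a squared-exponential kernel with fixed (not learned) hyperparameters, and a grid of candidate controls; the names `se_kernel`, `gp_fit`, `gp_mean`, and `gpdp_sketch` and all parameter values are hypothetical. The policy-learning step (two Gaussian processes selected by a binary classifier) is not covered.

```python
import numpy as np

def se_kernel(a, b, length=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_fit(x_train, y_train, noise_var=1e-6):
    """Return the weights K^{-1} y of a zero-mean GP posterior mean."""
    K = se_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    return np.linalg.solve(K, y_train)

def gp_mean(x_query, x_train, weights):
    """GP posterior mean at x_query."""
    return se_kernel(x_query, x_train) @ weights

def gpdp_sketch(a=0.8, b=0.4, q=1.0, r=1.0, horizon=30):
    """Backward recursion on a finite set of support states.

    Assumed dynamics: x' = a*x + b*u; stage cost: q*x^2 + r*u^2;
    terminal cost: q*x^2. All values are illustrative.
    """
    states = np.linspace(-2.0, 2.0, 21)      # support states
    controls = np.linspace(-1.0, 1.0, 101)   # candidate controls
    values = q * states**2                   # terminal value function
    policy = np.zeros_like(states)
    for _ in range(horizon):
        # Model the next-stage value function with a GP.
        weights = gp_fit(states, values)
        # Successor state for every (state, control) pair.
        nxt = a * states[:, None] + b * controls[None, :]
        next_value = gp_mean(nxt.ravel(), states, weights).reshape(nxt.shape)
        q_vals = q * states[:, None]**2 + r * controls[None, :]**2 + next_value
        # Bellman backup: minimize over the candidate controls.
        best = np.argmin(q_vals, axis=1)
        values = q_vals[np.arange(len(states)), best]
        policy = controls[best]
    return states, values, policy

states, values, policy = gpdp_sketch()
```

With these assumed parameters every successor state stays inside the support range, so the zero-mean GP is never queried where it has no data. The returned `values` and `policy` hold the approximate optimal cost-to-go and control at each support state, which can be compared against the Riccati solution of the same LQ problem.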
Pages: 4480 / +
Number of pages: 2
Related papers
50 in total
  • [31] Decentralized approximate dynamic programming for dynamic networks of agents
    Lakshmanan, Hariharan
    Pucci de Farias, Daniela
    2006 AMERICAN CONTROL CONFERENCE, VOLS 1-12, 2006, 1-12 : 1648 - +
  • [32] On constraint sampling in the linear programming approach to approximate dynamic programming
    de Farias, DP
    Van Roy, B
    MATHEMATICS OF OPERATIONS RESEARCH, 2004, 29 (03) : 462 - 478
  • [33] Estimation of Dynamic Gaussian Processes
    van Hulst, Jilles
    van Zuijlen, Roy
    Antunes, Duarte
    Heemels, W. P. M. H.
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 3206 - 3211
  • [34] Using Gaussian Processes in Bayesian Robot Programming
    Aznar, Fidel
    Pujol, Francisco A.
    Pujol, Mar
    Rizo, Ramon
    DISTRIBUTED COMPUTING, ARTIFICIAL INTELLIGENCE, BIOINFORMATICS, SOFT COMPUTING, AND AMBIENT ASSISTED LIVING, PT II, PROCEEDINGS, 2009, 5518 : 547 - +
  • [35] Quantile Propagation for Wasserstein-Approximate Gaussian Processes
    Zhang, Rui
    Walder, Christian J.
    Bonilla, Edwin V.
    Rizoiu, Marian-Andrei
    Xie, Lexing
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [36] Approximate inference for disease mapping with sparse Gaussian processes
    Vanhatalo, Jarno
    Pietilainen, Ville
    Vehtari, Aki
    STATISTICS IN MEDICINE, 2010, 29 (15) : 1580 - 1607
  • [37] Alleviating tuning sensitivity in Approximate Dynamic Programming
    Beuchat, Paul
    Georghiou, Angelos
    Lygeros, John
    2016 EUROPEAN CONTROL CONFERENCE (ECC), 2016, : 1616 - 1622
  • [38] Approximate dynamic programming based on expansive projections
    Arruda, Edilson R.
    do Val, Joao B. R.
    PROCEEDINGS OF THE 45TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-14, 2006, : 5540 - +
  • [39] ADPTriage: Approximate Dynamic Programming for Bug Triage
    Jahanshahi, H.
    Cevik, M.
    Mousavi, K.
    Basar, A.
    IEEE Transactions on Software Engineering, 2023, 49 (10) : 4594 - 4609
  • [40] Approximate dynamic programming approach for process control
    Lee, Jay H.
    Wong, Weechin
    JOURNAL OF PROCESS CONTROL, 2010, 20 (09) : 1038 - 1048