Approximate dynamic programming with Gaussian processes

Cited by: 19
Authors
Deisenroth, Marc P. [1, 2]
Peters, Jan [2]
Rasmussen, Carl E. [1, 2]
Affiliations
[1] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
[2] Max Planck Inst Biol Cybernet, Tubingen, Germany
DOI
10.1109/ACC.2008.4587201
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In general, it is difficult to determine an optimal closed-loop policy in nonlinear control problems with continuous-valued state and control domains. Hence, approximations are often inevitable. The standard method of discretizing states and controls suffers from the curse of dimensionality and strongly depends on the chosen temporal sampling rate. In this paper, we introduce Gaussian process dynamic programming (GPDP) and determine an approximate globally optimal closed-loop policy. In GPDP, value functions in the Bellman recursion of the dynamic programming algorithm are modeled using Gaussian processes. GPDP returns an optimal state-feedback for a finite set of states. Based on these outcomes, we learn a possibly discontinuous closed-loop policy on the entire state space by switching between two independently trained Gaussian processes. A binary classifier selects one Gaussian process to predict the optimal control signal. We show that GPDP is able to yield an almost optimal solution to an LQ problem using few sample points. Moreover, we successfully apply GPDP to the underpowered pendulum swing up, a complex nonlinear control problem.
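The Bellman recursion with a Gaussian process value-function model, as described in the abstract, might be sketched as follows. This is only an illustrative simplification, not the authors' implementation: the scalar LQ system, the fixed kernel hyperparameters, the grid of candidate controls, and every function name (`se_kernel`, `gp_fit`, `gp_mean`, `gpdp_sketch`) are assumptions made for this example. The classifier-based switching between two policy GPs is omitted.

```python
import numpy as np

def se_kernel(a, b, lengthscale=0.5):
    """Squared-exponential kernel matrix between 1-D arrays a and b.
    The lengthscale is fixed here (assumption); it is not learned."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_fit(x, y, jitter=1e-6):
    """Weights alpha = (K + jitter*I)^{-1} y of a zero-mean GP regressor."""
    K = se_kernel(x, x) + jitter * np.eye(len(x))
    return np.linalg.solve(K, y)

def gp_mean(x_query, x_train, alpha):
    """GP posterior mean evaluated at x_query."""
    return se_kernel(x_query, x_train) @ alpha

def gpdp_sketch(a=1.0, b=1.0, q=1.0, r=1.0, horizon=10):
    """Backward recursion for x' = a*x + b*u with stage cost q*x^2 + r*u^2.
    Dynamics are deterministic and known (assumption for this sketch)."""
    X = np.linspace(-2.0, 2.0, 21)   # finite set of support states
    U = np.linspace(-2.0, 2.0, 41)   # candidate controls (assumed grid)
    V = q * X ** 2                   # terminal cost at the support states
    for _ in range(horizon):
        alpha = gp_fit(X, V)         # GP model of the next-stage value function
        X_next = a * X[:, None] + b * U[None, :]
        V_next = gp_mean(X_next.ravel(), X, alpha).reshape(X_next.shape)
        Q = q * X[:, None] ** 2 + r * U[None, :] ** 2 + V_next
        # Successors outside the support range are treated as infeasible,
        # because the zero-mean GP would extrapolate towards zero there.
        Q[np.abs(X_next) > 2.0 + 1e-9] = np.inf
        best = np.argmin(Q, axis=1)
        V = Q[np.arange(len(X)), best]   # updated values at the support states
        u_opt = U[best]                  # minimizing control per support state
    return X, V, u_opt
```

Calling `gpdp_sketch()` returns the support states, their approximate values, and the minimizing control at each support state. For this LQ setting the controls can be compared against the Riccati gain, which is roughly 0.618 for `a = b = q = r = 1`.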
Pages: 4480 / +
Page count: 2
Related papers
50 in total
  • [21] Dynamic Programming for Approximate Expansion Algorithm
    Veksler, Olga
    COMPUTER VISION - ECCV 2012, PT III, 2012, 7574 : 850 - 863
  • [22] Approximate dynamic programming for stochastic reachability
    Kariotoglou, Nikolaos
    Summers, Sean
    Summers, Tyler
    Kamgarpour, Maryam
    Lygeros, John
    2013 EUROPEAN CONTROL CONFERENCE (ECC), 2013, : 584 - 589
  • [23] Approximate dynamic programming for container stacking
    Boschma, Rene
    Mes, Martijn R. K.
    de Vries, Leon R.
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2023, 310 (01) : 328 - 342
  • [24] Feature Discovery in Approximate Dynamic Programming
    Preux, Philippe
    Girgin, Sertan
    Loth, Manuel
    ADPRL: 2009 IEEE SYMPOSIUM ON ADAPTIVE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING, 2009, : 109 - +
  • [25] Approximate dynamic programming with a fuzzy parameterization
    Busoniu, Lucian
    Ernst, Damien
    De Schutter, Bart
    Babuska, Robert
    AUTOMATICA, 2010, 46 (05) : 804 - 814
  • [26] Bayesian Exploration for Approximate Dynamic Programming
    Ryzhov, Ilya O.
    Mes, Martijn R. K.
    Powell, Warren B.
    van den Berg, Gerald
    OPERATIONS RESEARCH, 2019, 67 (01) : 198 - 214
  • [27] APPROXIMATE UPPER FUNCTIONS OF GAUSSIAN-PROCESSES
    KONO, N
    JOURNAL OF MATHEMATICS OF KYOTO UNIVERSITY, 1983, 23 (01) : 195 - 209
  • [28] On approximate dynamic programming in switching systems
    Rantzer, Anders
    2005 44TH IEEE CONFERENCE ON DECISION AND CONTROL & EUROPEAN CONTROL CONFERENCE, VOLS 1-8, 2005, : 1391 - 1396
  • [29] Approximate Dynamic Programming for Ambulance Redeployment
    Maxwell, Matthew S.
    Restrepo, Mateo
    Henderson, Shane G.
    Topaloglu, Huseyin
    INFORMS JOURNAL ON COMPUTING, 2010, 22 (02) : 266 - 281
  • [30] An Approximate Dynamic Programming Approach to Dynamic Stochastic Matching
    You, Fan
    Vossen, Thomas
    INFORMS JOURNAL ON COMPUTING, 2024, 36 (04) : 1006 - 1022