Least squares solutions of the HJB equation with neural network value-function approximators

Cited by: 47
Authors
Tassa, Yuval [1]
Erez, Tom [2]
Affiliations
[1] Hebrew Univ Jerusalem, Interdisciplinary Ctr Neural Computat, IL-91904 Jerusalem, Israel
[2] Washington Univ, Dept Comp Sci & Engn, St Louis, MO 63130 USA
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2007, Vol. 18, No. 4
Keywords
differential neural networks (NNs); dynamic programming; feedforward neural networks; Hamilton-Jacobi-Bellman (HJB) equation; optimal control; viscosity solution
DOI
10.1109/TNN.2007.899249
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present an empirical study of iterative least squares minimization of the Hamilton-Jacobi-Bellman (HJB) residual with a neural network (NN) approximation of the value function. Although the nonlinearities in the optimal control problem and NN approximator preclude theoretical guarantees and raise concerns of numerical instabilities, we present two simple methods for promoting convergence, the effectiveness of which is demonstrated in a series of experiments. The first method involves the gradual increase of the horizon time scale, with a corresponding gradual increase in value function complexity. The second method involves the assumption of stochastic dynamics, which introduces a regularizing second derivative term to the HJB equation. A gradual reduction of this term provides further stabilization of the convergence. We demonstrate the solution of several problems, including the 4-D inverted-pendulum system with bounded control. Our approach requires no initial stabilizing policy or any restrictive assumptions on the plant or cost function, only knowledge of the plant dynamics. In the Appendix, we provide the equations for first- and second-order differential backpropagation.
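The abstract describes iterative least-squares minimization of the HJB residual over an NN value function, with a regularizing second-derivative (noise) term that is gradually reduced. The following minimal sketch, which is not the authors' code, illustrates that idea on a hypothetical one-dimensional control-affine problem. The dynamics, running cost, discount rate, network size, and the use of JAX automatic differentiation with plain gradient descent on the squared residual (instead of the paper's Levenberg-Marquardt-style least squares and differential-backpropagation equations) are all illustrative assumptions.

# Minimal sketch (assumptions labeled above): least-squares fitting of the
# HJB residual for a 1-D toy problem with a small tanh value network.
import jax
import jax.numpy as jnp

def init_params(key, hidden=16):
    k1, k2 = jax.random.split(key)
    return {"W1": 0.5 * jax.random.normal(k1, (hidden,)),
            "b1": jnp.zeros(hidden),
            "W2": 0.5 * jax.random.normal(k2, (hidden,)),
            "b2": jnp.zeros(())}

def value(params, x):
    # Single-hidden-layer tanh network V(x; theta) for a scalar state x.
    h = jnp.tanh(params["W1"] * x + params["b1"])
    return jnp.dot(params["W2"], h) + params["b2"]

def hjb_residual(params, x, rho, sigma):
    # First and second state derivatives of V via automatic differentiation
    # (the paper instead derives explicit differential-backpropagation rules).
    Vx = jax.grad(value, argnums=1)(params, x)
    Vxx = jax.grad(jax.grad(value, argnums=1), argnums=1)(params, x)
    # Assumed toy dynamics dx = (-x + u) dt and cost l(x, u) = x^2 + 0.5 u^2;
    # the minimizing control is u* = -Vx, giving the Hamiltonian below.
    ham = x**2 - x * Vx - 0.5 * Vx**2
    # Regularizing second-derivative term from assumed noise of strength sigma,
    # and discounting at rate rho.
    return ham + 0.5 * sigma**2 * Vxx - rho * value(params, x)

def loss(params, xs, rho, sigma):
    res = jax.vmap(lambda x: hjb_residual(params, x, rho, sigma))(xs)
    return jnp.mean(res**2)

@jax.jit
def gd_step(params, xs, rho, sigma, lr):
    grads = jax.grad(loss)(params, xs, rho, sigma)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

if __name__ == "__main__":
    params = init_params(jax.random.PRNGKey(0))
    xs = jnp.linspace(-2.0, 2.0, 64)  # collocation states
    # Anneal the noise level, echoing the gradual reduction of the
    # regularizing term described in the abstract.
    for sigma in (0.5, 0.25, 0.1):
        for _ in range(2000):
            params = gd_step(params, xs, 0.1, sigma, 1e-2)
    print("mean squared HJB residual:", loss(params, xs, 0.1, 0.1))

The annealing loop stands in for the paper's stabilization strategy; the paper's other device, gradually lengthening the horizon time scale, is not shown here.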
Pages: 1031 - 1041
Number of pages: 11
Related Papers
50 records in total
  • [1] An adaptive least-squares collocation radial basis function method for the HJB equation
    Alwardi, H.
    Wang, S.
    Jennings, L. S.
    Richardson, S.
    JOURNAL OF GLOBAL OPTIMIZATION, 2012, 52 (02) : 305 - 322
  • [2] Nearly optimal HJB solution for constrained input systems using a neural network least-squares approach
    Abu-Khalaf, M
    Lewis, FL
    PROCEEDINGS OF THE 41ST IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-4, 2002, : 943 - 948
  • [3] SINGULAR VALUE DECOMPOSITION AND LEAST SQUARES SOLUTIONS
    GOLUB, GH
    REINSCH, C
    NUMERISCHE MATHEMATIK, 1970, 14 (05) : 403 - 420
  • [4] GPS navigation solutions by analogue neural network least-squares processors
    Jwo, DJ
    JOURNAL OF NAVIGATION, 2005, 58 (01) : 105 - 118
  • [5] Fourier neural networks as function approximators and differential equation solvers
    Ngom, Marieme
    Marin, Oana
    STATISTICAL ANALYSIS AND DATA MINING, 2021, 14 (06) : 647 - 661
  • [6] Multilayer perceptrons as function approximators for analytical solutions of the diffusion equation
    Campisi, Laura D.
    COMPUTATIONAL GEOSCIENCES, 2015, 19 (04) : 769 - 780
  • [7] The least-squares solutions of the matrix equation A*XB
    Zhang, Huiting
    Yuan, Yuying
    Li, Sisi
    Yuan, Yongxin
    AIMS MATHEMATICS, 2022, 7 (03) : 3680 - 3691