Least squares solutions of the HJB equation with neural network value-function approximators

Cited by: 47
Authors
Tassa, Yuval [1]
Erez, Tom
Affiliations
[1] Hebrew Univ Jerusalem, Interdisciplinary Ctr Neural Computat, IL-91904 Jerusalem, Israel
[2] Washington Univ, Dept Comp Sci & Engn, St Louis, MO 63130 USA
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2007 / Vol. 18 / No. 04
Keywords
differential neural networks (NNs); dynamic programming; feedforward neural networks; Hamilton-Jacobi-Bellman (HJB) equation; optimal control; viscosity solution
DOI
10.1109/TNN.2007.899249
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present an empirical study of iterative least squares minimization of the Hamilton-Jacobi-Bellman (HJB) residual with a neural network (NN) approximation of the value function. Although the nonlinearities in the optimal control problem and NN approximator preclude theoretical guarantees and raise concerns of numerical instabilities, we present two simple methods for promoting convergence, the effectiveness of which is demonstrated in a series of experiments. The first method involves the gradual increase of the horizon time scale, with a corresponding gradual increase in value function complexity. The second method involves the assumption of stochastic dynamics, which introduces a regularizing second derivative term to the HJB equation. A gradual reduction of this term provides further stabilization of the convergence. We demonstrate the solution of several problems, including the 4-D inverted-pendulum system with bounded control. Our approach requires no initial stabilizing policy or any restrictive assumptions on the plant or cost function, only knowledge of the plant dynamics. In the Appendix, we provide the equations for first- and second-order differential backpropagation.
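As an illustration of the quantity being minimized, the setup in the abstract can be written in one common discounted form. The notation is an assumption made here for illustration and is not taken from the paper: state $x$, control $u$, dynamics $f$, cost rate $\ell$, horizon time scale $\tau$, noise magnitude $\sigma$, NN parameters $\theta$, and sample states $x_i$.

```latex
% Assumed setting (illustrative): dx = f(x,u)\,dt + \sigma\,dW, discount rate 1/\tau.
\begin{align}
  R(x;\theta) &= \min_{u}\Big[\ell(x,u) + \nabla_x V_\theta(x)^{\top} f(x,u)\Big]
                 + \frac{\sigma^{2}}{2}\,\operatorname{tr}\!\big(\nabla_x^{2} V_\theta(x)\big)
                 - \frac{1}{\tau}\,V_\theta(x), \\
  \theta^{\star} &= \arg\min_{\theta} \sum_{i=1}^{N} R(x_i;\theta)^{2}.
\end{align}
```

In this form, increasing $\tau$ lengthens the effective horizon, and letting $\sigma \to 0$ removes the second-derivative term and recovers the deterministic HJB equation. These would correspond to the two gradual schedules the abstract describes.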
Pages: 1031 - 1041
Page count: 11
Related papers
50 in total
  • [31] Convolutional Neural Network-Assisted Least-Squares Migration
    Boming Wu
    Hao Hu
    Hua-Wei Zhou
    Surveys in Geophysics, 2023, 44 : 1107 - 1124
  • [32] Approximate Solutions to Poisson Equation Using Least Squares Support Vector Machines
    Wu, Ziku
    Liu, Zhenbin
    Li, Fule
    Yu, Jiaju
    BOUNDARY AND INTERIOR LAYERS, COMPUTATIONAL AND ASYMPTOTIC METHODS, BAIL 2016, 2017, 120 : 197 - 203
  • [33] Least squares solutions with special structure to the linear matrix equation AXB = C
    Zhang, Fengxia
    Li, Ying
    Guo, Wenbin
    Zhao, Jianli
    APPLIED MATHEMATICS AND COMPUTATION, 2011, 217 (24) : 10049 - 10057
  • [34] Deep convolutional neural network and sparse least-squares migration
    Liu, Zhaolun
    Chen, Yuqing
    Schuster, Gerard
    GEOPHYSICS, 2020, 85 (04) : WA241 - WA253
  • [35] A neural network based data least squares algorithm for channel equalization
    Lim, Jun-Seok
    New Trends in Applied Artificial Intelligence, Proceedings, 2007, 4570 : 582 - 590
  • [36] Cascade Principal Component Least Squares Neural Network Learning Algorithm
    Khan, Waqar Ahmed
    Chung, Sai-Ho
    Chan, Ching Yuen
    2018 24TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATION AND COMPUTING (ICAC' 18), 2018, : 56 - 61
  • [37] Least squares B-spline solutions of the radial Dirac equation in the continuum
    Toffoli, D
    Decleva, P
    COMPUTER PHYSICS COMMUNICATIONS, 2003, 152 (02) : 151 - 164
  • [38] Convolutional Neural Network-Assisted Least-Squares Migration
    Wu, Boming
    Hu, Hao
    Zhou, Hua-Wei
    SURVEYS IN GEOPHYSICS, 2023, 44 (04) : 1107 - 1124
  • [39] Deep convolutional neural network and sparse least-squares migration
    Liu Z.
    Chen Y.
    Schuster G.
    Geophysics, 2020, 85 (04) : WA241 - WA253
  • [40] ITERATIVE ALGORITHMS FOR THE SYMMETRIC AND LEAST-SQUARES SYMMETRIC SOLUTIONS OF A TENSOR EQUATION
    Meng, Qun
    Xie, Yu-Zhu
    JP JOURNAL OF ALGEBRA NUMBER THEORY AND APPLICATIONS, 2021, 50 (02): : 179 - 212