Least squares solutions of the HJB equation with neural network value-function approximators

Cited by: 47
Authors
Tassa, Yuval [1]
Erez, Tom [2]
Affiliations
[1] Hebrew Univ Jerusalem, Interdisciplinary Ctr Neural Computat, IL-91904 Jerusalem, Israel
[2] Washington Univ, Dept Comp Sci & Engn, St Louis, MO 63130 USA
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2007, Vol. 18, No. 4
Keywords
differential neural networks (NNs); dynamic programming; feedforward neural networks; Hamilton-Jacobi-Bellman (HJB) equation; optimal control; viscosity solution
DOI
10.1109/TNN.2007.899249
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present an empirical study of iterative least squares minimization of the Hamilton-Jacobi-Bellman (HJB) residual with a neural network (NN) approximation of the value function. Although the nonlinearities in the optimal control problem and NN approximator preclude theoretical guarantees and raise concerns of numerical instabilities, we present two simple methods for promoting convergence, the effectiveness of which is demonstrated in a series of experiments. The first method involves the gradual increase of the horizon time scale, with a corresponding gradual increase in value function complexity. The second method involves the assumption of stochastic dynamics, which introduces a regularizing second derivative term to the HJB equation. A gradual reduction of this term provides further stabilization of the convergence. We demonstrate the solution of several problems, including the 4-D inverted-pendulum system with bounded control. Our approach requires no initial stabilizing policy or any restrictive assumptions on the plant or cost function, only knowledge of the plant dynamics. In the Appendix, we provide the equations for first- and second-order differential backpropagation.
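The two stabilizing ideas in the abstract can be illustrated on a toy problem. The sketch below is not the paper's method or code: it swaps the neural network for a two-parameter quadratic V(x) = a x^2 + b, and it assumes a 1-D plant dx = u dt + sigma dW with cost x^2 + u^2, where the horizon time scale tau enters as a discount rate 1/tau. All names (`hjb_residual`, `fit`, `closed_form`, the tau/sigma schedule) are hypothetical. Under those assumptions the HJB equation reads V/tau = min_u [x^2 + u^2 + V_x u] + (sigma^2/2) V_xx, and the least squares fit of its residual can be checked against a closed-form solution.

```python
import numpy as np

def hjb_residual(theta, x, tau, sigma):
    """HJB residual at sample points x for the toy value function a*x^2 + b."""
    a, b = theta
    V = a * x**2 + b
    Vx = 2.0 * a * x
    Vxx = 2.0 * a * np.ones_like(x)
    # min over u of (x^2 + u^2 + Vx*u) is attained at u = -Vx/2
    hamiltonian = x**2 - 0.25 * Vx**2
    # the noise contributes the second-derivative term (sigma^2 / 2) * Vxx
    return hamiltonian + 0.5 * sigma**2 * Vxx - V / tau

def residual_jacobian(theta, x, tau, sigma):
    """Derivatives of the residual with respect to (a, b), one row per sample."""
    a, _ = theta
    d_a = -2.0 * a * x**2 + sigma**2 - x**2 / tau
    d_b = -np.ones_like(x) / tau
    return np.stack([d_a, d_b], axis=1)

def fit(theta, x, tau, sigma, iters=20):
    """Gauss-Newton minimization of the summed squared residual."""
    for _ in range(iters):
        r = hjb_residual(theta, x, tau, sigma)
        J = residual_jacobian(theta, x, tau, sigma)
        step, *_ = np.linalg.lstsq(J, r, rcond=None)
        theta = theta - step
    return theta

def closed_form(tau, sigma):
    """Exact (a, b) for this toy problem: a^2 + a/tau - 1 = 0, b = tau*sigma^2*a."""
    a = 0.5 * (-1.0 / tau + np.sqrt(1.0 / tau**2 + 4.0))
    return np.array([a, tau * sigma**2 * a])

x = np.linspace(-2.0, 2.0, 41)
theta = np.zeros(2)
# Hypothetical schedule: lengthen the horizon and shrink the noise step by step,
# warm-starting each fit from the previous solution.
schedule = [(0.1, 1.0), (0.5, 0.7), (2.0, 0.4), (10.0, 0.1)]
for tau, sigma in schedule:
    theta = fit(theta, x, tau, sigma)
```

With a quadratic approximator the problem is easy enough that the schedule is not strictly needed; it is included only to show the warm-started continuation pattern that the abstract describes for harder, NN-based cases.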
Pages: 1031-1041
Page count: 11