NEWTON METHOD FOR STOCHASTIC CONTROL PROBLEMS

Cited by: 3
Authors
Gobet, Emmanuel [1]
Grangereau, Maxime [1, 2]
Affiliations
[1] Inst Polytech Paris, Ecole Polytech, CNRS, Ctr Math Appl CMAP, F-91128 Palaiseau, France
[2] Elect France EDF Lab Paris Saclay, F-91120 Palaiseau, France
Keywords
Newton method; stochastic optimal control; Forward-Backward Stochastic Differential Equations; Backward Stochastic Differential Equations; empirical regression; energy management; DIFFERENTIAL-EQUATIONS; DISCRETIZATION
DOI
10.1137/21M1408567
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We develop a new iterative method based on the Pontryagin principle to solve stochastic control problems. This method is nothing other than the Newton method extended to the framework of stochastic optimal control, where the state dynamics are given by an ODE with stochastic coefficients and the cost is random. Each iteration of the method is made of two ingredients: computing the Newton direction, and finding an adapted step length. The Newton direction is obtained by solving an affine-linear Forward-Backward Stochastic Differential Equation (FBSDE) with random coefficients. This is done in the setting of a general filtration. Solving such an FBSDE reduces to solving a Riccati Backward Stochastic Differential Equation (BSDE) and an affine-linear BSDE, as expected in the framework of linear FBSDEs or Linear-Quadratic stochastic control problems. We then establish convergence results for this Newton method. In particular, Lipschitz-continuity of the second-order derivative of the cost functional is established with an appropriate choice of norm and under boundedness assumptions, which is sufficient to prove (local) quadratic convergence of the method in the space of uniformly bounded processes. To choose an appropriate step length while fitting our choice of space of processes, an adapted backtracking line search method is developed. We then prove global convergence of the Newton method with the proposed line search procedure, which occurs at a quadratic rate after finitely many iterations. An implementation with regression techniques to solve BSDEs arising in the computation of the Newton step is developed. We apply it to the control problem of a large number of batteries providing ancillary services to an electricity network.
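The two ingredients of each iteration named in the abstract (a Newton direction, then a backtracking step length) can be illustrated with a finite-dimensional analogue. The sketch below is not the authors' method: it replaces the space of stochastic processes by R^2, replaces the FBSDE solve by a plain linear solve, and uses a standard Armijo backtracking rule rather than the adapted line search of the paper. The function names, the test function f(x) = sum_i sqrt(1 + x_i^2), and the parameters `beta` and `c` are all assumptions chosen for illustration. The test function is one where the full Newton step overshoots far from the minimizer, so the line search matters.

```python
import numpy as np


def newton_with_backtracking(f, grad, hess, x0, tol=1e-10, max_iter=50,
                             beta=0.5, c=1e-4):
    """Newton iterations with Armijo backtracking (illustrative sketch).

    Returns the final iterate and the list of accepted step lengths.
    """
    x = np.asarray(x0, dtype=float)
    steps = []
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        # Newton direction: solve H d = -g (stands in for the FBSDE solve).
        d = np.linalg.solve(hess(x), -g)
        # Backtracking: shrink t until the Armijo decrease condition holds.
        t, fx, slope = 1.0, f(x), g @ d
        while f(x + t * d) > fx + c * t * slope and t > 1e-12:
            t *= beta
        x = x + t * d
        steps.append(t)
    return x, steps


# Hypothetical test problem: convex, minimized at the origin.
def f(x):
    return np.sum(np.sqrt(1.0 + x**2))


def grad(x):
    return x / np.sqrt(1.0 + x**2)


def hess(x):
    return np.diag((1.0 + x**2) ** -1.5)


x_star, steps = newton_with_backtracking(f, grad, hess, [2.0, -3.0])
print("final iterate:", x_star)
print("accepted step lengths:", steps)
```

In this toy problem the first accepted step is shorter than 1 and later steps are full Newton steps, which mirrors, in a much simpler setting, the abstract's statement that the fast local rate is reached after finitely many iterations.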
Pages: 2996-3025
Number of pages: 30