A fully stochastic second-order trust region method

Cited by: 7

Authors:

Curtis, Frank E. [1]

Shi, Rui [1]

Affiliations:

[1] Lehigh Univ, Dept Ind & Syst Engn, Bethlehem, PA 18015 USA

Source:

Optimization Methods & Software | 2022 / Vol. 37 / No. 3

Funding:

U.S. National Science Foundation

Keywords:

Stochastic optimization; finite-sum optimization; stochastic Newton methods; trust region methods; machine learning; deep neural networks; time series forecasting; quasi-Newton method; optimization methods

DOI:

10.1080/10556788.2020.1852403

Chinese Library Classification (CLC):

TP31 [Computer software]

Subject classification codes:

081202; 0835

Abstract:

A stochastic second-order trust region method is proposed, which can be viewed as an extension of the trust-region-ish (TRish) algorithm proposed by Curtis et al. [A stochastic trust region algorithm based on careful step normalization. INFORMS J. Optim. 1(3) 200-220, 2019]. In each iteration, a search direction is computed by (approximately) solving a subproblem defined by stochastic gradient and Hessian estimates. The algorithm has convergence guarantees in the fully stochastic regime, i.e. when each stochastic gradient is merely an unbiased estimate of the gradient with bounded variance and the stochastic Hessian estimates are bounded. This framework covers a variety of implementations, such as when the stochastic Hessians are defined by sampled second-order derivatives or diagonal matrices, such as in RMSprop, Adagrad, Adam and other popular algorithms. The proposed algorithm has a worst-case complexity guarantee in the nearly deterministic regime, i.e. when the stochastic gradients and Hessians are close in expectation to the true gradients and Hessians. The results of numerical experiments for training CNNs for image classification and an RNN for time series forecasting are presented. These results show that the algorithm can outperform a stochastic gradient and first-order TRish algorithm.
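The subproblem idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a trust region radius patterned on the step normalization of the first-order TRish algorithm, with hypothetical parameters `alpha`, `gamma1` and `gamma2` (where `gamma1 > gamma2 > 0`), and it solves the subproblem only approximately with a Cauchy step along the negative stochastic gradient. The function names and the exact radius rule are assumptions made for illustration; the paper's own rule and subproblem solver may differ.

```python
import numpy as np


def trish_radius(g_norm, alpha, gamma1, gamma2):
    """Radius patterned on first-order TRish normalization (an assumption here).

    Small gradients give a radius proportional to gamma1 * alpha * ||g||,
    mid-sized gradients give a fixed radius alpha, and large gradients give
    a radius proportional to gamma2 * alpha * ||g||.
    """
    if g_norm < 1.0 / gamma1:
        return gamma1 * alpha * g_norm
    if g_norm <= 1.0 / gamma2:
        return alpha
    return gamma2 * alpha * g_norm


def cauchy_step(g, H, radius):
    """Approximately minimize g^T s + 0.5 * s^T H s subject to ||s|| <= radius.

    The search is restricted to s = -tau * g with tau >= 0, which gives the
    classical Cauchy point rather than an exact subproblem solution.
    """
    g_norm = np.linalg.norm(g)
    if g_norm == 0.0:
        return np.zeros_like(g)
    curvature = g @ H @ g          # curvature of the model along g
    tau_max = radius / g_norm      # largest tau that stays in the region
    if curvature <= 0.0:
        tau = tau_max              # model decreases all the way to the boundary
    else:
        tau = min(g_norm**2 / curvature, tau_max)
    return -tau * g


# Illustrative use with a made-up gradient estimate and Hessian estimate.
g_est = np.array([3.0, 4.0])
H_est = np.eye(2)
delta = trish_radius(np.linalg.norm(g_est), alpha=0.1, gamma1=4.0, gamma2=2.0)
step = cauchy_step(g_est, H_est, delta)
```

With a diagonal `H_est`, as in the RMSprop, Adagrad or Adam style scalings mentioned in the abstract, the same sketch applies unchanged, since only the product `g @ H @ g` is needed.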

Pages: 844-877

Page count: 34
