A fully stochastic second-order trust region method

Cited by: 7
Authors
Curtis, Frank E. [1 ]
Shi, Rui [1 ]
Affiliations
[1] Lehigh Univ, Dept Ind & Syst Engn, Bethlehem, PA 18015 USA
Source
OPTIMIZATION METHODS & SOFTWARE | 2022 / Vol. 37 / No. 03
Funding
National Science Foundation (USA);
Keywords
Stochastic optimization; finite-sum optimization; stochastic Newton methods; trust region methods; machine learning; deep neural networks; time series forecasting; QUASI-NEWTON METHOD; OPTIMIZATION METHODS;
DOI
10.1080/10556788.2020.1852403
Chinese Library Classification
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
A stochastic second-order trust region method is proposed, which can be viewed as an extension of the trust-region-ish (TRish) algorithm proposed by Curtis et al. [A stochastic trust region algorithm based on careful step normalization. INFORMS J. Optim. 1(3) 200-220, 2019]. In each iteration, a search direction is computed by (approximately) solving a subproblem defined by stochastic gradient and Hessian estimates. The algorithm has convergence guarantees in the fully stochastic regime, i.e. when each stochastic gradient is merely an unbiased estimate of the gradient with bounded variance and the stochastic Hessian estimates are bounded. This framework covers a variety of implementations, such as when the stochastic Hessians are defined by sampled second-order derivatives or diagonal matrices, such as in RMSprop, Adagrad, Adam and other popular algorithms. The proposed algorithm has a worst-case complexity guarantee in the nearly deterministic regime, i.e. when the stochastic gradients and Hessians are close in expectation to the true gradients and Hessians. The results of numerical experiments for training CNNs for image classification and an RNN for time series forecasting are presented. These results show that the algorithm can outperform a stochastic gradient and first-order TRish algorithm.
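As an illustration of the kind of subproblem the abstract refers to, the following is a minimal sketch and not the authors' algorithm. It assumes the stochastic Hessian estimate is a diagonal matrix with strictly positive entries (in the spirit of the diagonal scalings mentioned above) and solves the model problem min g.s + 0.5 * sum(d_i * s_i^2) subject to ||s|| <= delta by bisection on the Lagrange multiplier. The function name `solve_diag_tr_subproblem` and its parameters are hypothetical.

```python
import math


def solve_diag_tr_subproblem(g, d, delta, tol=1e-10, max_iter=200):
    """Minimize g.s + 0.5 * sum(d_i * s_i**2) subject to ||s||_2 <= delta.

    g     : stochastic gradient estimate (list of floats)
    d     : diagonal of the stochastic Hessian estimate, all entries > 0
            (assumption made for this sketch)
    delta : trust region radius, > 0
    """
    def step(lam):
        # Stationary point of the model with multiplier lam on the constraint.
        return [-gi / (di + lam) for gi, di in zip(g, d)]

    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    # If the unconstrained (Newton-like) step fits in the region, take it.
    s = step(0.0)
    if norm(s) <= delta:
        return s

    # Otherwise ||step(lam)|| decreases monotonically in lam, so bracket the
    # multiplier and bisect until the step lies on the boundary.
    lo, hi = 0.0, 1.0
    while norm(step(hi)) > delta:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if norm(step(mid)) > delta:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * max(1.0, hi):
            break
    return step(hi)  # feasible by construction: ||step(hi)|| <= delta


# Boundary case: the unconstrained step [-3, 4] has norm 5 > delta = 1.
s_boundary = solve_diag_tr_subproblem([3.0, -4.0], [1.0, 1.0], 1.0)

# Interior case: the unconstrained step already satisfies the constraint.
s_interior = solve_diag_tr_subproblem([0.1, 0.1], [1.0, 2.0], 1.0)
```

In the boundary case the step is the scaled direction of norm `delta`; in the interior case it reduces to the diagonally scaled step `-g_i / d_i`. In a stochastic setting, `g` and `d` would be recomputed from a fresh sample at every iteration.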
Pages: 844-877
Number of pages: 34