A Line Search Based Proximal Stochastic Gradient Algorithm with Dynamical Variance Reduction

Cited by: 8
Authors
Franchini, Giorgia [1 ]
Porta, Federica [1 ]
Ruggiero, Valeria [2 ]
Trombini, Ilaria [2 ,3 ]
Affiliations
[1] Univ Modena & Reggio Emilia, Dept Phys Informat & Math, Via Campi 213-B, I-41125 Modena, Italy
[2] Univ Ferrara, Dept Math & Comp Sci, Via Machiavelli 30, I-44121 Ferrara, Italy
[3] Univ Parma, Dept Math Phys & Comp Sci, Parco Area Sci 7-A, I-43124 Parma, Italy
Keywords
First order stochastic methods; Stochastic proximal methods; Machine learning; Green artificial intelligence; Convergence
DOI
10.1007/s10915-022-02084-3
Chinese Library Classification
O29 [Applied Mathematics]
Discipline code
070104
Abstract
Many optimization problems arising from machine learning applications can be cast as the minimization of the sum of two functions: the first typically represents the expected risk, which in practice is replaced by the empirical risk, while the second imposes a priori information on the solution. Since in general the first term is differentiable and the second one is convex, proximal gradient methods are well suited to such optimization problems. However, for large-scale machine learning problems the computation of the full gradient of the differentiable term can be prohibitively expensive, making these algorithms unsuitable. For this reason, proximal stochastic gradient methods have been extensively studied in the optimization literature over the last decades. In this paper we develop a proximal stochastic gradient algorithm based on two main ingredients: a technique to dynamically reduce the variance of the stochastic gradients along the iterative process, combined with a descent condition in expectation for the objective function, used to fix the value of the steplength parameter at each iteration. For general objective functionals, the almost sure convergence of the limit points of the sequence generated by the proposed scheme to stationary points is proved. For convex objective functionals, both the almost sure convergence of the whole sequence of iterates to a minimum point and an O(1/k) convergence rate for the objective function values are shown. The practical implementation of the proposed method requires neither the computation of the exact gradient of the empirical risk during the iterations nor the tuning of an optimal value for the steplength. An extensive numerical experimentation highlights that the proposed approach is robust with respect to the setting of the hyperparameters and competitive with state-of-the-art methods.
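The sketch below is only a generic illustration of the two ingredients described in the abstract, not the authors' algorithm: it runs a proximal stochastic gradient loop on a toy l1-regularized least squares problem, shrinks the gradient variance by letting the mini-batch size grow with the iteration counter, and sets the steplength by a backtracking sufficient-decrease test evaluated on the sampled smooth term (standing in for the paper's descent condition in expectation). The problem data, the batch schedule, and all constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy l1-regularized least squares: min_x (1/2n)||Ax - b||^2 + lam * ||x||_1
n, d, lam = 500, 50, 0.1
A = rng.standard_normal((n, d))
x_true = np.where(rng.random(d) < 0.2, rng.standard_normal(d), 0.0)
b = A @ x_true + 0.1 * rng.standard_normal(n)

def f_batch(x, idx):
    """Mini-batch estimate of the smooth (empirical risk) term."""
    r = A[idx] @ x - b[idx]
    return 0.5 * np.mean(r ** 2)

def grad_batch(x, idx):
    """Mini-batch stochastic gradient of the smooth term."""
    Ai = A[idx]
    return Ai.T @ (Ai @ x - b[idx]) / len(idx)

def prox_l1(v, t):
    """Proximal operator of t * lam * ||.||_1 (soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.0)

x = np.zeros(d)
alpha0, beta = 1.0, 0.5          # initial trial steplength, backtracking factor
for k in range(200):
    # dynamic variance reduction: the mini-batch grows along the iterations,
    # so the variance of the stochastic gradient estimate shrinks
    m = min(n, 8 + 2 * k)
    idx = rng.choice(n, size=m, replace=False)
    g = grad_batch(x, idx)
    f0 = f_batch(x, idx)

    # backtracking steplength selection on a sampled sufficient-decrease test
    alpha = alpha0
    while alpha > 1e-10:
        x_new = prox_l1(x - alpha * g, alpha)
        s = x_new - x
        # quadratic-upper-bound test on the sampled smooth term
        if f_batch(x_new, idx) <= f0 + g @ s + (s @ s) / (2 * alpha):
            break
        alpha *= beta
    x = x_new

full_obj = f_batch(x, np.arange(n)) + lam * np.sum(np.abs(x))
print(f"final objective on the full data set: {full_obj:.4f}")
```

In this sketch the exact gradient of the empirical risk is never computed during the loop and no fixed steplength has to be tuned, which mirrors the two practical features claimed in the abstract.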
Pages: 35