A mini-batch stochastic conjugate gradient algorithm with variance reduction

Cited by: 0
Authors
Caixia Kou
Han Yang
Affiliations
[1] Beijing University of Posts and Telecommunications, School of Science
Keywords
Deep learning; Empirical risk minimization; Stochastic conjugate gradient; Linear convergence
DOI
Not available
Abstract
The stochastic gradient descent method is popular for large-scale optimization, but the inherent variance of its gradient estimates slows its asymptotic convergence. To remedy this, many explicit variance reduction methods for stochastic descent have been proposed, such as SVRG (Johnson and Zhang [Advances in Neural Information Processing Systems, (2013), pp. 315-323]), SAG (Roux et al. [Advances in Neural Information Processing Systems, (2012), pp. 2663-2671]) and SAGA (Defazio et al. [Advances in Neural Information Processing Systems, (2014), pp. 1646-1654]). The conjugate gradient method, whose per-iteration computational cost matches that of gradient descent, is considered here. In this paper, in the spirit of SAGA, we propose a stochastic conjugate gradient algorithm called SCGA. With Fletcher-Reeves type choices of the conjugate parameter, we prove a linear convergence rate for smooth and strongly convex functions. We experimentally demonstrate that SCGA converges faster than popular SGD-type algorithms on four machine learning models, which may be convex, nonconvex or nonsmooth. On regression problems, SCGA is competitive with CGVR, which, to the best of our knowledge, is the only other stochastic conjugate gradient algorithm with variance reduction.
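The method described in the abstract combines two standard ingredients: a SAGA-style variance-reduced gradient estimate and a Fletcher-Reeves conjugate direction. The Python sketch below shows one way these pieces can fit together for a smooth, strongly convex finite-sum objective; the function name scga_sketch, the fixed step size alpha, and the restart-free direction update are illustrative assumptions and are not taken from the paper's SCGA pseudocode.

    import numpy as np

    def scga_sketch(grad_i, x0, n, alpha=0.01, n_steps=1000, seed=0):
        """Hypothetical sketch: SAGA-style variance reduction + Fletcher-Reeves direction.

        grad_i(x, i): gradient of the i-th component function at x
        x0:           starting point (1-D numpy array)
        n:            number of component functions in the finite sum
        This illustrates the idea described in the abstract, not the published SCGA.
        """
        rng = np.random.default_rng(seed)
        x = x0.astype(float)
        # SAGA table: last stored gradient of every component function
        table = np.array([grad_i(x, i) for i in range(n)])
        g_avg = table.mean(axis=0)
        d, g_prev = None, None
        for _ in range(n_steps):
            i = rng.integers(n)
            g_new = grad_i(x, i)
            # SAGA variance-reduced gradient estimate
            g = g_new - table[i] + g_avg
            # keep the running average and the stored table consistent
            g_avg = g_avg + (g_new - table[i]) / n
            table[i] = g_new
            # Fletcher-Reeves update: d_k = -g_k + beta_k * d_{k-1},
            # with beta_k = ||g_k||^2 / ||g_{k-1}||^2
            if d is None:
                d = -g
            else:
                beta = g.dot(g) / max(g_prev.dot(g_prev), 1e-12)
                d = -g + beta * d
            g_prev = g
            x = x + alpha * d  # fixed step; the paper may use a line search instead
        return x

As a usage example, for regularized least squares with rows a_i and targets b_i, grad_i(x, i) could return a_i * (a_i @ x - b_i) + lam * x, which is smooth and strongly convex for lam > 0, matching the setting in which the linear rate is proved.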
Pages: 1009-1025 (16 pages)
Related papers (50 in total)
  • [41] MINI-BATCH RISK FORMS
    Dentcheva, Darinka
    Ruszczynski, Andrzej
    SIAM JOURNAL ON OPTIMIZATION, 2023, 33 (02) : 615 - 637
  • [42] Boundedness and Convergence of Mini-batch Gradient Method with Cyclic Dropconnect and Penalty
    Jing, Junling
    Cai, Jinhang
    Zhang, Huisheng
    Zhang, Wenxia
    NEURAL PROCESSING LETTERS, 2024, 56 (02)
  • [43] Mini-batch gradient descent: faster convergence under data sparsity
    Khirirat, Sarit
    Feyzmahdavian, Hamid Reza
    Johansson, Mikael
    2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2017,
  • [44] Mini-batch descent in semiflows
    Corella, Alberto Dominguez
    Hernandez, Martin
    ESAIM-CONTROL OPTIMISATION AND CALCULUS OF VARIATIONS, 2025, 31
  • [45] Confidence Score based Mini-batch Skipping for CNN Training on Mini-batch Training Environment
    Jo, Joongho
    Park, Jongsun
    2020 17TH INTERNATIONAL SOC DESIGN CONFERENCE (ISOCC 2020), 2020, : 129 - 130
  • [46] Gaussian Process Parameter Estimation Using Mini-batch Stochastic Gradient Descent: Convergence Guarantees and Empirical Benefits
    Chen, Hao
    Zheng, Lili
    Al Kontar, Raed
    Raskutti, Garvesh
    JOURNAL OF MACHINE LEARNING RESEARCH, 2022, 23
  • [47] Breast cancer detection using Histopathology Image with Mini-Batch Stochastic Gradient Descent and Convolutional Neural Network
    Sasirekha, N.
    Karuppaiah, Jayakumar
    Shekhar, Himanshu
    Saranya, N. Naga
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 45 (03) : 4651 - 4667
  • [49] Mini-Batch Stochastic Three-Operator Splitting for Distributed Optimization
    Franci, Barbara
    Staudigl, Mathias
    IEEE CONTROL SYSTEMS LETTERS, 2022, 6 : 2882 - 2887
  • [50] Deterministic convergence of complex mini-batch gradient learning algorithm for fully complex-valued neural networks
    Zhang, Huisheng
    Zhang, Ying
    Zhu, Shuai
    Xu, Dongpo
    NEUROCOMPUTING, 2020, 407 : 185 - 193