A mini-batch stochastic conjugate gradient algorithm with variance reduction

Cited: 0
Authors
Caixia Kou
Han Yang
Institution
[1] Beijing University of Posts and Telecommunications, School of Science
Keywords
Deep learning; Empirical risk minimization; Stochastic conjugate gradient; Linear convergence
DOI
Not available
Abstract
The stochastic gradient descent method is popular for large-scale optimization, but it converges slowly in the asymptotic regime because of the inherent variance of its gradient estimates. To remedy this problem, many explicit variance reduction methods for stochastic gradient descent have been proposed, such as SVRG of Johnson and Zhang [Advances in Neural Information Processing Systems (2013), pp. 315–323], SAG of Roux et al. [Advances in Neural Information Processing Systems (2012), pp. 2663–2671], and SAGA of Defazio et al. [Advances in Neural Information Processing Systems (2014), pp. 1646–1654]. The conjugate gradient method, which has the same per-iteration computation cost as the gradient descent method, is considered here. In this paper, in the spirit of SAGA, we propose a stochastic conjugate gradient algorithm, which we call SCGA. With Fletcher–Reeves type choices of the conjugate parameter, we prove a linear convergence rate for smooth and strongly convex functions. We experimentally demonstrate that SCGA converges faster than popular SGD-type algorithms on four machine learning models, which may be convex, nonconvex, or nonsmooth. On regression problems, SCGA is competitive with CGVR, which, to our knowledge, is the only other stochastic conjugate gradient algorithm with variance reduction to date.
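To illustrate how a SAGA-style variance-reduced gradient estimate can be combined with a Fletcher–Reeves conjugate direction, the following Python sketch shows one possible update loop for minimizing an average of n smooth component losses. It is a minimal sketch under assumed interfaces (the callback grad_i, the fixed step size step, and the epoch count are illustrative placeholders), not the authors' exact SCGA algorithm.

```python
import numpy as np

def scga_sketch(grad_i, w0, n, step=0.01, epochs=5, rng=None):
    """Sketch of a SAGA-style stochastic conjugate gradient loop.

    grad_i(w, i) should return the gradient of the i-th component loss
    at w.  A table of stored per-sample gradients is kept as in SAGA,
    and search directions use a Fletcher-Reeves type beta.
    """
    rng = np.random.default_rng() if rng is None else rng
    w = w0.astype(float).copy()
    # Stored per-sample gradients (the SAGA table) and their mean.
    table = np.array([grad_i(w, i) for i in range(n)])
    table_mean = table.mean(axis=0)

    g_prev = table_mean.copy()
    d = -g_prev                      # initial (steepest-descent) direction
    for _ in range(epochs * n):
        i = rng.integers(n)
        g_i = grad_i(w, i)
        # SAGA-style variance-reduced gradient estimate.
        g = g_i - table[i] + table_mean
        # Update the stored gradient and its running mean.
        table_mean += (g_i - table[i]) / n
        table[i] = g_i
        # Fletcher-Reeves type conjugate parameter.
        beta = (g @ g) / (g_prev @ g_prev + 1e-12)
        d = -g + beta * d
        w += step * d
        g_prev = g
    return w
```

A SAGA-style table of stored gradients avoids the periodic full-gradient snapshots used by SVRG at the cost of extra memory proportional to n times the dimension; the abstract states that SCGA follows the SAGA route, while its actual step-size rule and any restart conditions may differ from the fixed step used in this sketch.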
Pages: 1009–1025
Page count: 16
Related papers
50 items in total
  • [21] Adaptive Natural Gradient Method for Learning of Stochastic Neural Networks in Mini-Batch Mode
    Park, Hyeyoung
    Lee, Kwanyong
    APPLIED SCIENCES-BASEL, 2019, 9 (21):
  • [22] Stronger Adversarial Attack: Using Mini-batch Gradient
    Yu, Lin
    Deng, Ting
    Zhang, Wenxiang
    Zeng, Zhigang
    2020 12TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTATIONAL INTELLIGENCE (ICACI), 2020, : 364 - 370
  • [23] Gradient preconditioned mini-batch SGD for ridge regression
    Zhang, Zhuan
    Zhou, Shuisheng
    Li, Dong
    Yang, Ting
    NEUROCOMPUTING, 2020, 413 : 284 - 293
  • [24] Mini-batch stochastic subgradient for functional constrained optimization
    Singh, Nitesh Kumar
    Necoara, Ion
    Kungurtsev, Vyacheslav
    OPTIMIZATION, 2024, 73 (07) : 2159 - 2185
  • [25] Scalable Hardware Accelerator for Mini-Batch Gradient Descent
    Rasoori, Sandeep
    Akella, Venkatesh
    PROCEEDINGS OF THE 2018 GREAT LAKES SYMPOSIUM ON VLSI (GLSVLSI'18), 2018, : 159 - 164
  • [26] A Learning Algorithm with a Gradient Normalization and a Learning Rate Adaptation for the Mini-batch Type Learning
    Ito, Daiki
    Okamoto, Takashi
    Koakutsu, Seiichi
    2017 56TH ANNUAL CONFERENCE OF THE SOCIETY OF INSTRUMENT AND CONTROL ENGINEERS OF JAPAN (SICE), 2017, : 811 - 816
  • [27] AN OPTIMAL DESIGN OF RANDOM SURFACES IN SOLAR CELLS VIA MINI-BATCH STOCHASTIC GRADIENT APPROACH
    Wang, Dan
    Li, Qiang
    Shen, Jihong
    COMMUNICATIONS IN MATHEMATICAL SCIENCES, 2022, 20 (03) : 747 - 762
  • [28] Automatic Setting of Learning Rate and Mini-batch Size in Momentum and AdaM Stochastic Gradient Methods
    Franchini, Giorgia
    Porta, Federica
    INTERNATIONAL CONFERENCE ON NUMERICAL ANALYSIS AND APPLIED MATHEMATICS 2022, ICNAAM-2022, 2024, 3094
  • [29] Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization
    Ghadimi, Saeed
    Lan, Guanghui
    Zhang, Hongchao
    MATHEMATICAL PROGRAMMING, 2016, 155 (1-2) : 267 - 305
  • [30] Efficient distance metric learning by adaptive sampling and mini-batch stochastic gradient descent (SGD)
    Qian, Qi
    Jin, Rong
    Yi, Jinfeng
    Zhang, Lijun
    Zhu, Shenghuo
    MACHINE LEARNING, 2015, 99 (03) : 353 - 372