A mini-batch stochastic conjugate gradient algorithm with variance reduction

Cited by: 0
Authors
Caixia Kou
Han Yang
Affiliations
[1] Beijing University of Posts and Telecommunications, School of Science
Keywords
Deep learning; Empirical risk minimization; Stochastic conjugate gradient; Linear convergence
DOI
Not available
Abstract
The stochastic gradient descent method is popular for large-scale optimization, but the inherent variance of its gradient estimates slows its asymptotic convergence. To remedy this, many explicit variance reduction methods for stochastic descent have been proposed, such as SVRG (Johnson and Zhang [Advances in Neural Information Processing Systems, (2013), pp. 315-323]), SAG (Roux et al. [Advances in Neural Information Processing Systems, (2012), pp. 2663-2671]) and SAGA (Defazio et al. [Advances in Neural Information Processing Systems, (2014), pp. 1646-1654]). The conjugate gradient method, whose per-iteration computational cost matches that of gradient descent, is considered here. In this paper, in the spirit of SAGA, we propose a stochastic conjugate gradient algorithm called SCGA. With Fletcher-Reeves type choices of the conjugate parameter, we prove a linear convergence rate for smooth and strongly convex functions. We experimentally demonstrate that SCGA converges faster than popular SGD-type algorithms on four machine learning models, which may be convex, nonconvex or nonsmooth. On regression problems, SCGA is competitive with CGVR, which, to the best of our knowledge, is the only other stochastic conjugate gradient algorithm with variance reduction.
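The method described in the abstract combines two standard ingredients: a SAGA-style variance-reduced gradient estimate and a Fletcher-Reeves conjugate direction. The Python sketch below shows one way these pieces can fit together for a smooth, strongly convex finite-sum objective; the function name scga_sketch, the fixed step size alpha, and the restart-free direction update are illustrative assumptions and are not taken from the paper's SCGA pseudocode.

    import numpy as np

    def scga_sketch(grad_i, x0, n, alpha=0.01, n_steps=1000, seed=0):
        """Hypothetical sketch: SAGA-style variance reduction + Fletcher-Reeves direction.

        grad_i(x, i): gradient of the i-th component function at x
        x0:           starting point (1-D numpy array)
        n:            number of component functions in the finite sum
        This illustrates the idea described in the abstract, not the published SCGA.
        """
        rng = np.random.default_rng(seed)
        x = x0.astype(float)
        # SAGA table: last stored gradient of every component function
        table = np.array([grad_i(x, i) for i in range(n)])
        g_avg = table.mean(axis=0)
        d, g_prev = None, None
        for _ in range(n_steps):
            i = rng.integers(n)
            g_new = grad_i(x, i)
            # SAGA variance-reduced gradient estimate
            g = g_new - table[i] + g_avg
            # keep the running average and the stored table consistent
            g_avg = g_avg + (g_new - table[i]) / n
            table[i] = g_new
            # Fletcher-Reeves update: d_k = -g_k + beta_k * d_{k-1},
            # with beta_k = ||g_k||^2 / ||g_{k-1}||^2
            if d is None:
                d = -g
            else:
                beta = g.dot(g) / max(g_prev.dot(g_prev), 1e-12)
                d = -g + beta * d
            g_prev = g
            x = x + alpha * d  # fixed step; the paper may use a line search instead
        return x

As a usage example, for regularized least squares with rows a_i and targets b_i, grad_i(x, i) could return a_i * (a_i @ x - b_i) + lam * x, which is smooth and strongly convex for lam > 0, matching the setting in which the linear rate is proved.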
Pages: 1009-1025 (16 pages)
Related papers (50 in total)
  • [41] MINI-BATCH RISK FORMS
    Dentcheva, Darinka
    Ruszczynski, Andrzej
    SIAM JOURNAL ON OPTIMIZATION, 2023, 33 (02) : 615 - 637
  • [42] Boundedness and Convergence of Mini-batch Gradient Method with Cyclic Dropconnect and Penalty
    Jing, Junling
    Cai, Jinhang
    Zhang, Huisheng
    Zhang, Wenxia
    NEURAL PROCESSING LETTERS, 2024, 56 (02)
  • [43] Mini-batch gradient descent: faster convergence under data sparsity
    Khirirat, Sarit
    Feyzmahdavian, Hamid Reza
    Johansson, Mikael
    2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2017,
  • [44] Mini-batch descent in semiflows
    Corella, Alberto Dominguez
    Hernandez, Martin
    ESAIM-CONTROL OPTIMISATION AND CALCULUS OF VARIATIONS, 2025, 31
  • [45] Confidence Score based Mini-batch Skipping for CNN Training on Mini-batch Training Environment
    Jo, Joongho
    Park, Jongsun
    2020 17TH INTERNATIONAL SOC DESIGN CONFERENCE (ISOCC 2020), 2020, : 129 - 130
  • [46] Gaussian Process Parameter Estimation Using Mini-batch Stochastic Gradient Descent: Convergence Guarantees and Empirical Benefits
    Chen, Hao
    Zheng, Lili
    Al Kontar, Raed
    Raskutti, Garvesh
    JOURNAL OF MACHINE LEARNING RESEARCH, 2022, 23
  • [47] Breast cancer detection using Histopathology Image with Mini-Batch Stochastic Gradient Descent and Convolutional Neural Network
    Sasirekha, N.
    Karuppaiah, Jayakumar
    Shekhar, Himanshu
    Saranya, N. Naga
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 45 (03) : 4651 - 4667
  • [49] Mini-Batch Stochastic Three-Operator Splitting for Distributed Optimization
    Franci, Barbara
    Staudigl, Mathias
    IEEE CONTROL SYSTEMS LETTERS, 2022, 6 : 2882 - 2887
  • [50] Deterministic convergence of complex mini-batch gradient learning algorithm for fully complex-valued neural networks
    Zhang, Huisheng
    Zhang, Ying
    Zhu, Shuai
    Xu, Dongpo
    NEUROCOMPUTING, 2020, 407 : 185 - 193