A mini-batch stochastic conjugate gradient algorithm with variance reduction

Cited by: 0
Authors
Caixia Kou
Han Yang
Affiliation
[1] Beijing University of Posts and Telecommunications, School of Science
Keywords
Deep learning; Empirical risk minimization; Stochastic conjugate gradient; Linear convergence;
DOI
Not available
Abstract
The stochastic gradient descent method is popular for large-scale optimization, but its asymptotic convergence is slow because of the inherent variance of its gradient estimates. To remedy this problem, many explicit variance reduction methods for stochastic gradient descent have been proposed, such as SVRG (Johnson and Zhang [Advances in Neural Information Processing Systems, (2013), pp. 315–323]), SAG (Roux et al. [Advances in Neural Information Processing Systems, (2012), pp. 2663–2671]), and SAGA (Defazio et al. [Advances in Neural Information Processing Systems, (2014), pp. 1646–1654]). We consider the conjugate gradient method, which has the same per-iteration computational cost as gradient descent. In this paper, in the spirit of SAGA, we propose a stochastic conjugate gradient algorithm, which we call SCGA. With Fletcher-Reeves type choices of the conjugate parameter, we prove a linear convergence rate for smooth and strongly convex functions. We experimentally demonstrate that SCGA converges faster than popular SGD-type algorithms on four machine learning models, which may be convex, nonconvex or nonsmooth. On regression problems, SCGA is competitive with CGVR, which, to the best of our knowledge, is the only existing stochastic conjugate gradient algorithm with variance reduction so far.
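The abstract names the two ingredients of SCGA (a SAGA-style variance-reduced gradient estimate and a Fletcher-Reeves conjugate direction) but not the exact update rules. The Python sketch below only illustrates how those ingredients can be combined; the function name scga_sketch, the fixed step size, the mini-batch size, and the per-epoch restart of the direction are assumptions for illustration, not the authors' algorithm.

```python
import numpy as np

def scga_sketch(grad_i, w0, n, lr=0.05, epochs=20, batch=10, seed=0):
    """Illustrative sketch: SAGA-style variance-reduced gradients combined
    with a Fletcher-Reeves conjugate direction, restarted every epoch."""
    rng = np.random.default_rng(seed)
    w = w0.astype(float)
    table = np.array([grad_i(w, i) for i in range(n)])  # stored per-sample gradients
    avg = table.mean(axis=0)                            # their running average
    for _ in range(epochs):
        d, prev_sq = None, None                         # restart direction each epoch (assumption)
        for _ in range(n // batch):
            idx = rng.choice(n, size=batch, replace=False)
            g_new = np.array([grad_i(w, i) for i in idx])
            # SAGA-style variance-reduced gradient estimate on the mini-batch.
            g = g_new.mean(axis=0) - table[idx].mean(axis=0) + avg
            sq = float(g @ g)
            if d is None:
                d = -g
            else:
                # Fletcher-Reeves coefficient: beta = ||g_k||^2 / ||g_{k-1}||^2.
                d = -g + (sq / prev_sq) * d
            prev_sq = sq
            w = w + lr * d                              # fixed step size (assumption)
            # Refresh the gradient table and keep its average exact.
            avg = avg + (g_new - table[idx]).sum(axis=0) / n
            table[idx] = g_new
    return w

# Toy usage on least squares: per-sample gradient of 0.5 * (x_i @ w - y_i)^2.
X = np.random.default_rng(1).normal(size=(200, 5))
y = X @ np.ones(5)
w_hat = scga_sketch(lambda w, i: (X[i] @ w - y[i]) * X[i], np.zeros(5), n=200)
```

The Fletcher-Reeves coefficient reuses the variance-reduced gradient estimates already computed, so each iteration costs no more than a mini-batch SAGA step apart from the stored gradient table, which is consistent with the abstract's remark that conjugate gradient has the same computational cost as gradient descent.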
Pages: 1009 - 1025
Number of pages: 16
Related Papers
50 records in total
  • [31] Efficient distance metric learning by adaptive sampling and mini-batch stochastic gradient descent (SGD)
    Qi Qian
    Rong Jin
    Jinfeng Yi
    Lijun Zhang
    Shenghuo Zhu
    Machine Learning, 2015, 99 : 353 - 372
  • [32] Tuple-oriented Compression for Large-scale Mini-batch Stochastic Gradient Descent
    Li, Fengan
    Chen, Lingjiao
    Zeng, Yijing
    Kumar, Arun
    Wu, Xi
    Naughton, Jeffrey F.
    Patel, Jignesh M.
    SIGMOD '19: PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2019, : 1517 - 1534
  • [33] Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization
    Saeed Ghadimi
    Guanghui Lan
    Hongchao Zhang
    Mathematical Programming, 2016, 155 : 267 - 305
  • [34] Research on Mini-Batch Affinity Propagation Clustering Algorithm
    Xu, Ziqi
    Lu, Yahui
    Jiang, Yu
    2022 IEEE 9TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2022, : 86 - 95
  • [35] Statistical Analysis of Fixed Mini-Batch Gradient Descent Estimator
    Qi, Haobo
    Wang, Feifei
    Wang, Hansheng
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2023, 32 (04) : 1348 - 1360
  • [36] HYPERSPECTRAL UNMIXING VIA PROJECTED MINI-BATCH GRADIENT DESCENT
    Li, Jing
    Li, Xiaorun
    Zhao, Liaoying
    2017 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2017, : 1133 - 1136
  • [37] Carbon Emission Forecasting Study Based on Influence Factor Mining and Mini-Batch Stochastic Gradient Optimization
    Yang, Wei
    Yuan, Qiheng
    Wang, Yongli
    Zheng, Fei
    Shi, Xin
    Li, Yi
    ENERGIES, 2024, 17 (01)
  • [38] Staleness-Reduction Mini-Batch K-Means
    Zhu, Xueying
    Sun, Jie
    He, Zhenhao
    Jiang, Jiantong
    Wang, Zeke
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (10) : 14424 - 14436
  • [39] Mini-Batch Spectral Clustering
    Han, Yufei
    Filippone, Maurizio
    2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017, : 3888 - 3895
  • [40] Boundedness and Convergence of Mini-batch Gradient Method with Cyclic Dropconnect and Penalty
    Junling Jing
    Cai Jinhang
    Huisheng Zhang
    Wenxia Zhang
    Neural Processing Letters, 56