A mini-batch stochastic conjugate gradient algorithm with variance reduction

Cited by: 0
Authors
Caixia Kou
Han Yang
Affiliation
[1] Beijing University of Posts and Telecommunications, School of Science
Keywords
Deep learning; Empirical risk minimization; Stochastic conjugate gradient; Linear convergence;
DOI
Not available
Abstract
The stochastic gradient descent method is popular for large-scale optimization, but its asymptotic convergence is slow because of the inherent variance of its gradient estimates. To remedy this problem, many explicit variance reduction methods for stochastic gradient descent have been proposed, such as SVRG (Johnson and Zhang [Advances in Neural Information Processing Systems, (2013), pp. 315–323]), SAG (Roux et al. [Advances in Neural Information Processing Systems, (2012), pp. 2663–2671]), and SAGA (Defazio et al. [Advances in Neural Information Processing Systems, (2014), pp. 1646–1654]). We consider the conjugate gradient method, which has the same per-iteration computational cost as gradient descent. In this paper, in the spirit of SAGA, we propose a stochastic conjugate gradient algorithm, which we call SCGA. With Fletcher-Reeves type choices of the conjugate parameter, we prove a linear convergence rate for smooth and strongly convex functions. We experimentally demonstrate that SCGA converges faster than popular SGD-type algorithms on four machine learning models, which may be convex, nonconvex or nonsmooth. On regression problems, SCGA is competitive with CGVR, which, to the best of our knowledge, is the only existing stochastic conjugate gradient algorithm with variance reduction so far.
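The abstract names the two ingredients of SCGA (a SAGA-style variance-reduced gradient estimate and a Fletcher-Reeves conjugate direction) but not the exact update rules. The Python sketch below only illustrates how those ingredients can be combined; the function name scga_sketch, the fixed step size, the mini-batch size, and the per-epoch restart of the direction are assumptions for illustration, not the authors' algorithm.

```python
import numpy as np

def scga_sketch(grad_i, w0, n, lr=0.05, epochs=20, batch=10, seed=0):
    """Illustrative sketch: SAGA-style variance-reduced gradients combined
    with a Fletcher-Reeves conjugate direction, restarted every epoch."""
    rng = np.random.default_rng(seed)
    w = w0.astype(float)
    table = np.array([grad_i(w, i) for i in range(n)])  # stored per-sample gradients
    avg = table.mean(axis=0)                            # their running average
    for _ in range(epochs):
        d, prev_sq = None, None                         # restart direction each epoch (assumption)
        for _ in range(n // batch):
            idx = rng.choice(n, size=batch, replace=False)
            g_new = np.array([grad_i(w, i) for i in idx])
            # SAGA-style variance-reduced gradient estimate on the mini-batch.
            g = g_new.mean(axis=0) - table[idx].mean(axis=0) + avg
            sq = float(g @ g)
            if d is None:
                d = -g
            else:
                # Fletcher-Reeves coefficient: beta = ||g_k||^2 / ||g_{k-1}||^2.
                d = -g + (sq / prev_sq) * d
            prev_sq = sq
            w = w + lr * d                              # fixed step size (assumption)
            # Refresh the gradient table and keep its average exact.
            avg = avg + (g_new - table[idx]).sum(axis=0) / n
            table[idx] = g_new
    return w

# Toy usage on least squares: per-sample gradient of 0.5 * (x_i @ w - y_i)^2.
X = np.random.default_rng(1).normal(size=(200, 5))
y = X @ np.ones(5)
w_hat = scga_sketch(lambda w, i: (X[i] @ w - y[i]) * X[i], np.zeros(5), n=200)
```

The Fletcher-Reeves coefficient reuses the variance-reduced gradient estimates already computed, so each iteration costs no more than a mini-batch SAGA step apart from the stored gradient table, which is consistent with the abstract's remark that conjugate gradient has the same computational cost as gradient descent.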
Pages: 1009 - 1025
Number of pages: 16
Related Papers
50 records in total
  • [31] Efficient distance metric learning by adaptive sampling and mini-batch stochastic gradient descent (SGD)
    Qi Qian
    Rong Jin
    Jinfeng Yi
    Lijun Zhang
    Shenghuo Zhu
    Machine Learning, 2015, 99 : 353 - 372
  • [32] Tuple-oriented Compression for Large-scale Mini-batch Stochastic Gradient Descent
    Li, Fengan
    Chen, Lingjiao
    Zeng, Yijing
    Kumar, Arun
    Wu, Xi
    Naughton, Jeffrey F.
    Patel, Jignesh M.
    SIGMOD '19: PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2019, : 1517 - 1534
  • [33] Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization
    Saeed Ghadimi
    Guanghui Lan
    Hongchao Zhang
    Mathematical Programming, 2016, 155 : 267 - 305
  • [34] Research on Mini-Batch Affinity Propagation Clustering Algorithm
    Xu, Ziqi
    Lu, Yahui
    Jiang, Yu
    2022 IEEE 9TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2022, : 86 - 95
  • [35] Statistical Analysis of Fixed Mini-Batch Gradient Descent Estimator
    Qi, Haobo
    Wang, Feifei
    Wang, Hansheng
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2023, 32 (04) : 1348 - 1360
  • [36] HYPERSPECTRAL UNMIXING VIA PROJECTED MINI-BATCH GRADIENT DESCENT
    Li, Jing
    Li, Xiaorun
    Zhao, Liaoying
    2017 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2017, : 1133 - 1136
  • [37] Carbon Emission Forecasting Study Based on Influence Factor Mining and Mini-Batch Stochastic Gradient Optimization
    Yang, Wei
    Yuan, Qiheng
    Wang, Yongli
    Zheng, Fei
    Shi, Xin
    Li, Yi
    ENERGIES, 2024, 17 (01)
  • [38] Staleness-Reduction Mini-Batch K-Means
    Zhu, Xueying
    Sun, Jie
    He, Zhenhao
    Jiang, Jiantong
    Wang, Zeke
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (10) : 14424 - 14436
  • [39] Mini-Batch Spectral Clustering
    Han, Yufei
    Filippone, Maurizio
    2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017, : 3888 - 3895
  • [40] Boundedness and Convergence of Mini-batch Gradient Method with Cyclic Dropconnect and Penalty
    Junling Jing
    Cai Jinhang
    Huisheng Zhang
    Wenxia Zhang
    Neural Processing Letters, 56