Low-Precision Stochastic Gradient Langevin Dynamics

Cited by: 0
Authors
Zhang, Ruqi [1]
Wilson, Andrew Gordon [2]
De Sa, Christopher [3]
Affiliations
[1] Univ Texas Austin, Austin, TX 78712 USA
[2] NYU, New York, NY 10012 USA
[3] Cornell Univ, Ithaca, NY 14853 USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162 | 2022
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
While low-precision optimization has been widely used to accelerate deep learning, low-precision sampling remains largely unexplored. As a consequence, sampling is simply infeasible in many large-scale scenarios, despite providing remarkable benefits to generalization and uncertainty estimation for neural networks. In this paper, we provide the first study of low-precision Stochastic Gradient Langevin Dynamics (SGLD), showing that its costs can be significantly reduced without sacrificing performance, due to its intrinsic ability to handle system noise. We prove that the convergence of low-precision SGLD with full-precision gradient accumulators is less affected by the quantization error than its SGD counterpart in the strongly convex setting. To further enable low-precision gradient accumulators, we develop a new quantization function for SGLD that preserves the variance in each update step. We demonstrate that low-precision SGLD achieves comparable performance to full-precision SGLD with only 8 bits on a variety of deep learning tasks.
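The idea of running SGLD on quantized weights while keeping a full-precision accumulator can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `stochastic_round` and `sgld_step`, the quantization gap `delta`, and the use of plain unbiased stochastic rounding are all assumptions made for the example, and the variance-preserving quantization function mentioned in the abstract is not reproduced here.

```python
import numpy as np

def stochastic_round(x, delta, rng):
    """Round each entry of x to a multiple of delta without bias (assumed quantizer)."""
    scaled = np.asarray(x, dtype=np.float64) / delta
    lower = np.floor(scaled)
    # Round up with probability equal to the fractional part, so E[Q(x)] = x.
    round_up = rng.random(scaled.shape) < (scaled - lower)
    return delta * (lower + round_up)

def sgld_step(theta, grad_fn, step_size, delta, rng):
    """One SGLD step; `theta` is the full-precision accumulator (hypothetical helper)."""
    # The gradient is evaluated at a low-precision copy of the weights.
    theta_q = stochastic_round(theta, delta, rng)
    # Standard SGLD noise: Gaussian with variance 2 * step_size.
    noise = rng.normal(size=np.shape(theta))
    return theta - step_size * grad_fn(theta_q) + np.sqrt(2.0 * step_size) * noise
```

With `grad_fn = lambda t: t` (the gradient of the energy of a standard Gaussian), repeated calls to `sgld_step` should produce samples whose variance stays close to 1 for a small `step_size` and a small `delta`.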
Pages: 21
Related papers
50 in total
  • [21] Approximation to Stochastic Variance Reduced Gradient Langevin Dynamics by Stochastic Delay Differential Equations
    Chen, Peng
    Lu, Jianya
    Xu, Lihu
    APPLIED MATHEMATICS AND OPTIMIZATION, 2022, 85 (02)
  • [22] Approximation to Stochastic Variance Reduced Gradient Langevin Dynamics by Stochastic Delay Differential Equations
    Peng Chen
    Jianya Lu
    Lihu Xu
    Applied Mathematics & Optimization, 2022, 85
  • [23] On stochastic gradient Langevin dynamics with dependent data streams in the logconcave case
    Barkhagen, M.
    Chau, N. H.
    Moulines, E.
    Rasonyi, M.
    Sabanis, S.
    Zhang, Y.
    BERNOULLI, 2021, 27 (01) : 1 - 33
  • [24] Nonasymptotic Estimation of Risk Measures Using Stochastic Gradient Langevin Dynamics
    Chu, Jiarui
    Tangpi, Ludovic
    SIAM JOURNAL ON FINANCIAL MATHEMATICS, 2024, 15 (02) : 503 - 536
  • [25] HYBRID DETERMINISTIC-STOCHASTIC GRADIENT LANGEVIN DYNAMICS FOR BAYESIAN LEARNING
    He, Qi
    Xin, Jack
    COMMUNICATIONS IN INFORMATION AND SYSTEMS, 2012, 12 (03) : 221 - 232
  • [26] Langevin dynamics for adaptive inverse reinforcement learning of stochastic gradient algorithms
    Krishnamurthy, Vikram
    Yin, George
    Journal of Machine Learning Research, 2021, 22
  • [27] Langevin Dynamics for Adaptive Inverse Reinforcement Learning of Stochastic Gradient Algorithms
    Krishnamurthy, Vikram
    Yin, George
    JOURNAL OF MACHINE LEARNING RESEARCH, 2021, 22 : 1 - 49
  • [28] LOW-PRECISION FORMULAS FOR PLANETARY POSITIONS
    VANFLANDERN, TC
    PULKKINEN, KF
    ASTROPHYSICAL JOURNAL SUPPLEMENT SERIES, 1979, 41 (03) : 391 - 411
  • [29] Synthesizing Efficient Low-Precision Kernels
    Izycheva, Anastasiia
    Darulova, Eva
    Seidl, Helmut
    AUTOMATED TECHNOLOGY FOR VERIFICATION AND ANALYSIS (ATVA 2019), 2019, 11781 : 294 - 313
  • [30] SGLB: Stochastic Gradient Langevin Boosting
    Ustimenko, Aleksei
    Prokhorenkova, Liudmila
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139 : 7502 - 7511