Low-Precision Stochastic Gradient Langevin Dynamics

Cited by: 0

Authors:

Zhang, Ruqi [1]

Wilson, Andrew Gordon [2]

De Sa, Christopher [3]

Affiliations:

[1] Univ Texas Austin, Austin, TX 78712 USA

[2] NYU, New York, NY 10012 USA

[3] Cornell Univ, Ithaca, NY 14853 USA

Source:

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162 | 2022

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

While low-precision optimization has been widely used to accelerate deep learning, low-precision sampling remains largely unexplored. As a consequence, sampling is simply infeasible in many large-scale scenarios, despite providing remarkable benefits to generalization and uncertainty estimation for neural networks. In this paper, we provide the first study of low-precision Stochastic Gradient Langevin Dynamics (SGLD), showing that its costs can be significantly reduced without sacrificing performance, due to its intrinsic ability to handle system noise. We prove that the convergence of low-precision SGLD with full-precision gradient accumulators is less affected by the quantization error than its SGD counterpart in the strongly convex setting. To further enable low-precision gradient accumulators, we develop a new quantization function for SGLD that preserves the variance in each update step. We demonstrate that low-precision SGLD achieves comparable performance to full-precision SGLD with only 8 bits on a variety of deep learning tasks.
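
To make the setting in the abstract concrete, here is a minimal sketch of SGLD in which the gradient is evaluated at quantized weights while the update is accumulated in full precision. Everything in it is an illustrative assumption rather than the authors' method: the fixed-point grid with spacing `delta`, the stochastic-rounding quantizer, the helper names `stochastic_round` and `sgld_low_precision`, and the toy standard-normal target. It does not implement the variance-preserving quantization function mentioned above, whose definition is not given in this record.

```python
import numpy as np


def stochastic_round(x, delta, rng):
    """Round x onto the grid {k * delta}, up or down at random so that E[Q(x)] = x."""
    scaled = np.asarray(x, dtype=np.float64) / delta
    floor = np.floor(scaled)
    # Round up with probability equal to the fractional part.
    round_up = rng.random(scaled.shape) < (scaled - floor)
    return (floor + round_up) * delta


def sgld_low_precision(grad_energy, theta0, step, n_steps, delta, rng):
    """SGLD where the gradient sees quantized weights; the accumulator stays float64."""
    theta = np.array(theta0, dtype=np.float64)
    samples = np.empty((n_steps,) + theta.shape)
    for k in range(n_steps):
        theta_q = stochastic_round(theta, delta, rng)  # low-precision copy of the weights
        noise = rng.standard_normal(theta.shape)
        # Langevin step: gradient descent on the energy plus Gaussian noise.
        theta = theta - step * grad_energy(theta_q) + np.sqrt(2.0 * step) * noise
        samples[k] = theta
    return samples


rng = np.random.default_rng(0)
# Toy target (assumed for illustration): standard normal, U(theta) = theta^2 / 2.
samples = sgld_low_precision(
    grad_energy=lambda t: t,
    theta0=np.zeros(1),
    step=0.01,
    n_steps=20000,
    delta=2.0 ** -4,
    rng=rng,
)
```

Because the rounding is unbiased, the quantization error enters the update as extra zero-mean noise on top of the injected Gaussian noise, which is one informal way to read the abstract's remark about SGLD's ability to handle system noise.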


Pages: 21

