Efficient and generalizable tuning strategies for stochastic gradient MCMC

Cited by: 3
Authors
Coullon, Jeremie [1 ]
South, Leah [2 ]
Nemeth, Christopher [3 ]
Affiliations
[1] Papercup Technol Ltd, London, England
[2] Queensland Univ Technol, Ctr Data Sci, Sch Math Sci, Brisbane, Australia
[3] Univ Lancaster, Math & Stat, Lancaster, Lancashire, England
Funding
Engineering and Physical Sciences Research Council (EPSRC, UK)
Keywords
Stochastic gradient; Stein discrepancy; Markov chain Monte Carlo; Hyperparameter optimization
DOI
10.1007/s11222-023-10233-3
Chinese Library Classification
TP301 (Theory, Methods)
Discipline code
081202
Abstract
Stochastic gradient Markov chain Monte Carlo (SGMCMC) is a popular class of algorithms for scalable Bayesian inference. However, these algorithms include hyperparameters, such as the step size and batch size, that influence the accuracy of estimators based on the obtained posterior samples. As a result, these hyperparameters must be tuned by the practitioner, and currently no principled, automated way to tune them exists. Standard Markov chain Monte Carlo tuning methods based on acceptance rates cannot be used for SGMCMC, thus requiring alternative tools and diagnostics. We propose a novel bandit-based algorithm that tunes the SGMCMC hyperparameters by minimizing the Stein discrepancy between the true posterior and its Monte Carlo approximation. We provide theoretical results supporting this approach and assess various Stein-based discrepancies. We support our results with experiments on both simulated and real datasets, and find that this method is practical for a wide range of applications.
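The abstract describes the method only at a high level. As a rough illustration (not the authors' code), the sketch below runs stochastic gradient Langevin dynamics (SGLD) for a handful of candidate step sizes and scores each resulting chain with a kernel Stein discrepancy (KSD) built from the inverse multiquadric (IMQ) kernel. The Gaussian toy target, the kernel parameters `beta` and `c2`, and the exhaustive grid over step sizes are illustrative assumptions; the paper itself uses a multi-armed bandit to search the hyperparameter space rather than this brute-force loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target: standard 2-d Gaussian, so the score grad log p(x) = -x is exact.
def score(x):
    return -x

def sgld(step_size, n_iters=1000, dim=2):
    """Stochastic gradient Langevin dynamics with an artificially noisy score
    standing in for a minibatch gradient estimate."""
    x = np.zeros(dim)
    samples = np.empty((n_iters, dim))
    for t in range(n_iters):
        noisy_grad = score(x) + rng.normal(scale=0.5, size=dim)
        x = x + 0.5 * step_size * noisy_grad + np.sqrt(step_size) * rng.normal(size=dim)
        samples[t] = x
    return samples

def ksd(samples, beta=-0.5, c2=1.0):
    """V-statistic estimate of the kernel Stein discrepancy with the IMQ
    kernel k(x, y) = (c2 + ||x - y||^2)**beta."""
    s = score(samples)                                # (n, d) scores at samples
    diff = samples[:, None, :] - samples[None, :, :]  # (n, n, d) pairwise x - y
    r2 = np.sum(diff**2, axis=-1)
    base = c2 + r2
    k = base**beta
    grad_x_k = 2.0 * beta * diff * base[..., None]**(beta - 1.0)
    d = samples.shape[1]
    trace = -2.0 * beta * (d * base**(beta - 1.0)
                           + 2.0 * (beta - 1.0) * r2 * base**(beta - 2.0))
    # Stein kernel: trace(grad_x grad_y k) + grad_x k . s(y)
    #               + grad_y k . s(x) + k s(x) . s(y), with grad_y k = -grad_x k.
    k0 = (trace
          + np.einsum('ijk,jk->ij', grad_x_k, s)
          - np.einsum('ijk,ik->ij', grad_x_k, s)
          + k * (s @ s.T))
    return np.sqrt(k0.mean())

# Score each candidate step size; a smaller KSD indicates samples that better
# approximate the target posterior.
for eps in [1e-3, 1e-2, 1e-1, 1.0]:
    print(f"step size {eps:g}: KSD = {ksd(sgld(eps)):.3f}")
```

On this toy problem the KSD grows for step sizes that are either too small (the chain barely moves) or too large (the invariant distribution is inflated by discretization bias), which is exactly the trade-off the paper's bandit tuner exploits.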
Pages: 18
Related papers
(50 in total)
  • [1] Efficient and generalizable tuning strategies for stochastic gradient MCMC
    Coullon, Jeremie
    South, Leah
    Nemeth, Christopher
    Statistics and Computing, 2023, 33 (3)
  • [2] AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC
    Zhang, Ruqi
    Cooper, A. Feder
    De Sa, Christopher
    International Conference on Artificial Intelligence and Statistics (AISTATS), 2020, 108: 2142-2151
  • [3] Communication-Efficient Stochastic Gradient MCMC for Neural Networks
    Li, Chunyuan
    Chen, Changyou
    Pu, Yunchen
    Henao, Ricardo
    Carin, Lawrence
    Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), 2019: 4173-4180
  • [4] Distributed Stochastic Gradient MCMC
    Ahn, Sungjin
    Shahbaba, Babak
    Welling, Max
    International Conference on Machine Learning (ICML), 2014, 32: 1044-1052
  • [5] Structured Stochastic Gradient MCMC
    Alexos, Antonios
    Boyd, Alex
    Mandt, Stephan
    International Conference on Machine Learning (ICML), 2022, 162: 414-434
  • [6] Stochastic thermodynamic integration: Efficient Bayesian model selection via stochastic gradient MCMC
    Simsekli, Umut
    Badeau, Roland
    Richard, Gael
    Cemgil, Ali Taylan
    2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016: 2574-2578
  • [7] A Complete Recipe for Stochastic Gradient MCMC
    Ma, Yi-An
    Chen, Tianqi
    Fox, Emily B.
    Advances in Neural Information Processing Systems 28 (NIPS 2015), 2015, 28
  • [8] Control variates for stochastic gradient MCMC
    Baker, Jack
    Fearnhead, Paul
    Fox, Emily B.
    Nemeth, Christopher
    Statistics and Computing, 2019, 29 (3): 599-615
  • [9] Stochastic Gradient MCMC with Stale Gradients
    Chen, Changyou
    Ding, Nan
    Li, Chunyuan
    Zhang, Yizhe
    Carin, Lawrence
    Advances in Neural Information Processing Systems 29 (NIPS 2016), 2016, 29