Efficient and generalizable tuning strategies for stochastic gradient MCMC

Citations: 3
Authors
Coullon, Jeremie [1 ]
South, Leah [2 ]
Nemeth, Christopher [3 ]
Affiliations
[1] Papercup Technol Ltd, London, England
[2] Queensland Univ Technol, Ctr Data Sci, Sch Math Sci, Brisbane, Australia
[3] Univ Lancaster, Math & Stat, Lancaster, Lancashire, England
Funding
Engineering and Physical Sciences Research Council (UK);
Keywords
Stochastic gradient; Stein discrepancy; Markov chain Monte Carlo; Hyperparameter optimization;
DOI
10.1007/s11222-023-10233-3
Chinese Library Classification
TP301 [Theory and Methods];
Subject Classification Code
081202;
Abstract
Stochastic gradient Markov chain Monte Carlo (SGMCMC) is a popular class of algorithms for scalable Bayesian inference. However, these algorithms include hyperparameters, such as the step size and batch size, that influence the accuracy of estimators based on the obtained posterior samples. As a result, these hyperparameters must be tuned by the practitioner, and currently no principled, automated way to tune them exists. Standard Markov chain Monte Carlo tuning methods based on acceptance rates cannot be used for SGMCMC, so alternative tools and diagnostics are required. We propose a novel bandit-based algorithm that tunes the SGMCMC hyperparameters by minimizing the Stein discrepancy between the true posterior and its Monte Carlo approximation. We provide theoretical results supporting this approach and assess various Stein-based discrepancies. We support our results with experiments on both simulated and real datasets, and find that this method is practical for a wide range of applications.
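As a rough illustration of the idea summarized in the abstract, the Python sketch below tunes the step size of stochastic gradient Langevin dynamics (SGLD) on a toy Gaussian-mean model by minimizing a kernel Stein discrepancy (KSD) with an IMQ kernel. This is a minimal, assumption-laden stand-in and not the authors' implementation: the toy model, the one-dimensional KSD, and the grid search (used here in place of the paper's bandit procedure) are all illustrative choices.

# Hedged sketch, not the authors' implementation: selects an SGLD step size by
# minimizing a kernel Stein discrepancy (KSD) on a toy Gaussian-mean model,
# with a simple grid search standing in for the bandit procedure.
import numpy as np

rng = np.random.default_rng(0)

# Toy model: y_i ~ N(theta, sigma2) with prior theta ~ N(0, tau2),
# so the posterior is Gaussian and its score is known in closed form.
N, sigma2, tau2 = 1000, 1.0, 10.0
y = rng.normal(0.5, np.sqrt(sigma2), size=N)
post_prec = 1.0 / tau2 + N / sigma2
post_mean = (y.sum() / sigma2) / post_prec

def posterior_score(theta):
    # Gradient of the log posterior density (exact here, used only inside the KSD).
    return -(theta - post_mean) * post_prec

def sgld(step, n_iter=2000, batch=32):
    # Stochastic gradient Langevin dynamics with a fixed step size.
    theta, samples = 0.0, []
    for _ in range(n_iter):
        idx = rng.integers(0, N, size=batch)
        grad = -theta / tau2 + (N / batch) * np.sum(y[idx] - theta) / sigma2
        theta += 0.5 * step * grad + np.sqrt(step) * rng.normal()
        samples.append(theta)
    return np.array(samples[n_iter // 2:])  # discard first half as burn-in

def ksd(samples, score, c2=1.0, beta=-0.5):
    # Kernel Stein discrepancy with an IMQ kernel, O(n^2) V-statistic (1-d case).
    x = samples[:, None]
    d = x - x.T                               # pairwise differences x_i - x_j
    u = c2 + d ** 2
    k = u ** beta                             # IMQ kernel
    dkx = 2.0 * beta * u ** (beta - 1) * d    # d/dx k(x, y)
    dky = -dkx                                # d/dy k(x, y)
    dkxy = -4.0 * beta * (beta - 1) * u ** (beta - 2) * d ** 2 \
           - 2.0 * beta * u ** (beta - 1)     # d^2/(dx dy) k(x, y)
    sx = score(samples)[:, None]
    sy = sx.T
    k0 = dkxy + dkx * sy + dky * sx + k * sx * sy   # Stein kernel
    return np.sqrt(k0.mean())

# Grid search over candidate step sizes: keep the one whose SGLD samples are
# closest to the posterior as measured by the KSD.
candidates = [1e-6, 1e-5, 1e-4, 1e-3]
results = {eps: ksd(sgld(eps), posterior_score) for eps in candidates}
print(min(results, key=results.get), results)

In the setting described in the abstract, the exhaustive grid search above would be replaced by the proposed bandit-based procedure, which adaptively allocates the sampling budget across candidate hyperparameter configurations while still using a Stein-based discrepancy as the quality criterion.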
Pages: 18
Related papers
50 records in total
  • [21] An adaptive Hessian approximated stochastic gradient MCMC method
    Wang, Yating
    Deng, Wei
    Lin, Guang
    JOURNAL OF COMPUTATIONAL PHYSICS, 2021, 432
  • [22] CPSG-MCMC: Clustering-Based Preprocessing method for Stochastic Gradient MCMC
    Fu, Tianfan
    Zhang, Zhihua
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 54, 2017, 54 : 841 - 850
  • [23] Learning Weight Uncertainty with Stochastic Gradient MCMC for Shape Classification
    Li, Chunyuan
    Stevens, Andrew
    Chen, Changyou
    Pu, Yunchen
    Gan, Zhe
    Carin, Lawrence
    2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 5666 - 5675
  • [24] Learning Deep Generative Models With Doubly Stochastic Gradient MCMC
    Du, Chao
    Zhu, Jun
    Zhang, Bo
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (07) : 3084 - 3096
  • [25] Bayesian sparse learning with preconditioned stochastic gradient MCMC and its applications
    Wang, Yating
    Deng, Wei
    Lin, Guang
    JOURNAL OF COMPUTATIONAL PHYSICS, 2021, 432
  • [26] On the Convergence of Stochastic Gradient MCMC Algorithms with High-Order Integrators
    Chen, Changyou
    Ding, Nan
    Carin, Lawrence
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [27] A convergence analysis for a class of practical variance-reduction stochastic gradient MCMC
    Chen, Changyou
    Wang, Wenlin
    Zhang, Yizhe
    Su, Qinliang
    Carin, Lawrence
    SCIENCE CHINA-INFORMATION SCIENCES, 2019, 62 (01) : 67 - 79
  • [28] Non-convex Learning via Replica Exchange Stochastic Gradient MCMC
    Deng, Wei
    Feng, Qi
    Gao, Liyao
    Liang, Faming
    Lin, Guang
    25TH AMERICAS CONFERENCE ON INFORMATION SYSTEMS (AMCIS 2019), 2019,
  • [29] Non-convex Learning via Replica Exchange Stochastic Gradient MCMC
    Deng, Wei
    Feng, Qi
    Gao, Liyao
    Liang, Faming
    Lin, Guang
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119