Efficient and generalizable tuning strategies for stochastic gradient MCMC

Cited by: 3
Authors
Coullon, Jeremie [1 ]
South, Leah [2 ]
Nemeth, Christopher [3 ]
Affiliations
[1] Papercup Technol Ltd, London, England
[2] Queensland Univ Technol, Ctr Data Sci, Sch Math Sci, Brisbane, Australia
[3] Univ Lancaster, Math & Stat, Lancaster, Lancashire, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Stochastic gradient; Stein discrepancy; Markov chain Monte Carlo; Hyperparameter optimization;
DOI
10.1007/s11222-023-10233-3
CLC number
TP301 [Theory and Methods];
Subject classification code
081202 ;
Abstract
Stochastic gradient Markov chain Monte Carlo (SGMCMC) is a popular class of algorithms for scalable Bayesian inference. However, these algorithms include hyperparameters such as step size or batch size that influence the accuracy of estimators based on the obtained posterior samples. As a result, these hyperparameters must be tuned by the practitioner and currently no principled and automated way to tune them exists. Standard Markov chain Monte Carlo tuning methods based on acceptance rates cannot be used for SGMCMC, thus requiring alternative tools and diagnostics. We propose a novel bandit-based algorithm that tunes the SGMCMC hyperparameters by minimizing the Stein discrepancy between the true posterior and its Monte Carlo approximation. We provide theoretical results supporting this approach and assess various Stein-based discrepancies. We support our results with experiments on both simulated and real datasets, and find that this method is practical for a wide range of applications.
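The core idea of the abstract — choose SGMCMC hyperparameters by minimizing a Stein discrepancy between the target posterior and the samples a given hyperparameter setting produces — can be illustrated with a toy sketch. This is not the paper's bandit algorithm: as simplifying assumptions it uses a one-dimensional standard normal target, a plain unadjusted Langevin sampler in place of minibatch SGMCMC, a grid search over candidate step sizes rather than a bandit, and the kernel Stein discrepancy with an IMQ base kernel.

```python
import numpy as np

def score(x):
    # Score of the standard normal target: grad log p(x) = -x.
    return -x

def imq_ksd(samples, score_fn, c=1.0):
    """Squared kernel Stein discrepancy (V-statistic) for 1-D samples,
    using the IMQ kernel k(x, y) = (c^2 + (x - y)^2)^(-1/2)."""
    x = np.asarray(samples)
    r = x[:, None] - x[None, :]                # pairwise differences x_i - x_j
    u = c**2 + r**2
    sx = score_fn(x)[:, None]                  # score at x_i
    sy = score_fn(x)[None, :]                  # score at x_j
    k = u**-0.5
    dxk = -r * u**-1.5                         # d/dx of k
    dyk = r * u**-1.5                          # d/dy of k
    dxdyk = u**-1.5 - 3.0 * r**2 * u**-2.5     # d^2/dxdy of k
    # Stein kernel: the discrepancy is zero in expectation under the target.
    k0 = dxdyk + dxk * sy + dyk * sx + k * sx * sy
    return k0.mean()

def ula_chain(step, n_iter, rng):
    # Unadjusted Langevin: x <- x + (step/2) * grad log p(x) + sqrt(step) * noise.
    x, out = 0.0, np.empty(n_iter)
    for i in range(n_iter):
        x = x + 0.5 * step * score(x) + np.sqrt(step) * rng.standard_normal()
        out[i] = x
    return out

rng = np.random.default_rng(0)
candidates = [1e-3, 1e-2, 1e-1, 0.5, 1.5]
ksds = {s: imq_ksd(ula_chain(s, 2000, rng), score) for s in candidates}
best = min(ksds, key=ksds.get)
print(f"selected step size: {best}  (KSD^2 = {ksds[best]:.4f})")
```

A step size that is too small leaves the chain stuck near its initial point, while one that is too large inflates the stationary variance; both inflate the discrepancy, so the minimizer is an intermediate value. The paper replaces this exhaustive grid evaluation with a multi-armed bandit so that promising hyperparameter settings receive most of the sampling budget.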
Pages: 18