Thompson Sampling on Symmetric α-Stable Bandits

Cited by: 0
Authors
Dubey, Abhimanyu [1]
Pentland, Alex Sandy [1]
Affiliations
[1] MIT, Cambridge, MA 02139 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Thompson Sampling provides an efficient technique to introduce prior knowledge in the multi-armed bandit problem, along with providing remarkable empirical performance. In this paper, we revisit the Thompson Sampling algorithm under rewards drawn from symmetric alpha-stable distributions, which are a class of heavy-tailed probability distributions utilized in finance and economics, in problems such as modeling stock prices and human behavior. We present an efficient framework for posterior inference, which leads to two algorithms for Thompson Sampling in this setting. We prove finite-time regret bounds for both algorithms, and demonstrate through a series of experiments the stronger performance of Thompson Sampling in this setting. With our results, we provide an exposition of symmetric alpha-stable distributions in sequential decision-making, and enable sequential Bayesian inference in applications from diverse fields in finance and complex systems that operate on heavy-tailed features.
Pages: 5715-5721
Page count: 7
Related papers
50 in total
  • [41] Modelling with mixture of symmetric stable distributions using Gibbs sampling
    Salas-Gonzalez, Diego
    Kuruoglu, Ercan E.
    Ruiz, Diego P.
    SIGNAL PROCESSING, 2010, 90 (03) : 774 - 783
  • [42] Batched Thompson Sampling
    Kalkanli, Cem
    Ozgur, Ayfer
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [43] A Tutorial on Thompson Sampling
    Russo, Daniel J.
    Van Roy, Benjamin
    Kazerouni, Abbas
    Osband, Ian
    Wen, Zheng
    FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2018, 11 (01): 1 - 96
  • [44] Spectral Thompson Sampling
    Kocak, Tomas
    Valko, Michal
    Munos, Remi
    Agrawal, Shipra
    PROCEEDINGS OF THE TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2014: 1911 - 1917
  • [45] Collaborative Thompson Sampling
    Zhenyu Zhu
    Liusheng Huang
    Hongli Xu
    Mobile Networks and Applications, 2020, 25 : 1351 - 1363
  • [46] Sampling - Thompson,SK
    Rindskopf, D
    JOURNAL OF EDUCATIONAL AND BEHAVIORAL STATISTICS, 1997, 22 (02) : 246 - 246
  • [47] Universal Thompson Sampling
    Faella, Marco
    Sauro, Luigi
    2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA, 2022: 1109 - 1114
  • [48] Collaborative Thompson Sampling
    Zhu, Zhenyu
    Huang, Liusheng
    Xu, Hongli
    MOBILE NETWORKS & APPLICATIONS, 2020, 25 (04): 1351 - 1363
  • [49] Parallelizing Thompson Sampling
    Karbasi, Amin
    Mirrokni, Vahab
    Shadravan, Mohammad
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [50] Marginal Posterior Sampling for Slate Bandits
    Dimakopoulou, Maria
    Vlassis, Nikos
    Jebara, Tony
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019: 2223 - 2229