Thompson Sampling on Symmetric α-Stable Bandits

Authors
Dubey, Abhimanyu [1]
Pentland, Alex Sandy [1]
Affiliations
[1] MIT, Cambridge, MA 02139 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Thompson Sampling provides an efficient technique to introduce prior knowledge in the multi-armed bandit problem, along with providing remarkable empirical performance. In this paper, we revisit the Thompson Sampling algorithm under rewards drawn from symmetric alpha-stable distributions, which are a class of heavy-tailed probability distributions utilized in finance and economics, in problems such as modeling stock prices and human behavior. We present an efficient framework for posterior inference, which leads to two algorithms for Thompson Sampling in this setting. We prove finite-time regret bounds for both algorithms, and demonstrate through a series of experiments the stronger performance of Thompson Sampling in this setting. With our results, we provide an exposition of symmetric alpha-stable distributions in sequential decision-making, and enable sequential Bayesian inference in applications from diverse fields in finance and complex systems that operate on heavy-tailed features.
Pages: 5715-5721
Page count: 7
Related papers
50 in total
  • [1] A Thompson Sampling Algorithm for Cascading Bandits
    Cheung, Wang Chi
    Tan, Vincent Y. F.
    Zhong, Zixin
    22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89 : 438 - 447
  • [2] Thompson Sampling for Linearly Constrained Bandits
    Saxena, Vidit
    Gonzalez, Joseph E.
    Jalden, Joakim
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108
  • [3] Double Thompson Sampling for Dueling Bandits
    Wu, Huasen
    Liu, Xin
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [4] On the Performance of Thompson Sampling on Logistic Bandits
    Dong, Shi
    Ma, Tengyu
    Van Roy, Benjamin
    CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [5] Thompson Sampling Algorithms for Cascading Bandits
    Zhong, Zixin
    Cheung, Wang Chi
    Tan, Vincent Y. F.
    JOURNAL OF MACHINE LEARNING RESEARCH, 2021, 22
  • [6] Thompson Sampling for Bandits with Clustered Arms
    Carlsson, Emil
    Dubhashi, Devdatt
    Johansson, Fredrik D.
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 2212 - 2218
  • [7] Thompson Sampling for Combinatorial Semi-Bandits
    Wang, Siwei
    Chen, Wei
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [8] Thompson Sampling for Multinomial Logit Contextual Bandits
    Oh, Min-hwan
    Iyengar, Garud
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [9] Thompson Sampling for Stochastic Bandits with Graph Feedback
    Tossou, Aristide C. Y.
    Dimitrakakis, Christos
    Dubhashi, Devdatt
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2660 - 2666
  • [10] Variational Thompson Sampling for Relational Recurrent Bandits
    Lamprier, Sylvain
    Gisselbrecht, Thibault
    Gallinari, Patrick
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2017, PT II, 2017, 10535 : 405 - 421