Thompson Sampling on Symmetric α-Stable Bandits

Authors
Dubey, Abhimanyu [1]
Pentland, Alex Sandy [1]
Affiliations
[1] MIT, Cambridge, MA 02139 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Thompson Sampling provides an efficient technique to introduce prior knowledge in the multi-armed bandit problem, along with providing remarkable empirical performance. In this paper, we revisit the Thompson Sampling algorithm under rewards drawn from symmetric alpha-stable distributions, which are a class of heavy-tailed probability distributions utilized in finance and economics, in problems such as modeling stock prices and human behavior. We present an efficient framework for posterior inference, which leads to two algorithms for Thompson Sampling in this setting. We prove finite-time regret bounds for both algorithms, and demonstrate through a series of experiments the stronger performance of Thompson Sampling in this setting. With our results, we provide an exposition of symmetric alpha-stable distributions in sequential decision-making, and enable sequential Bayesian inference in applications from diverse fields in finance and complex systems that operate on heavy-tailed features.
Pages: 5715-5721
Number of pages: 7