Stacked Thompson Bandits

Cited by: 3
Authors
Belzner, Lenz [1]
Gabor, Thomas [1]
Affiliations
[1] Ludwig Maximilians Univ Munchen, Inst Informat, Munich, Germany
DOI
10.1109/SEsCPS.2017.4
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
We introduce Stacked Thompson Bandits (STB) for efficiently generating plans that are likely to satisfy a given bounded temporal logic requirement. STB uses a simulation to evaluate plans and takes a Bayesian approach to using the resulting information to guide its search. In particular, we show that stacking multi-armed bandits and using Thompson sampling to guide each bandit's action selection enables STB to generate plans that satisfy requirements with high probability while searching only a fraction of the search space.
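The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it rests on several assumptions that the abstract does not confirm: one bandit per plan step, a binary reward (the simulated plan either satisfies the requirement or not), and a Beta posterior with a uniform prior for each arm. The names `stacked_thompson_plan`, `toy_requirement`, and all parameters are hypothetical.

```python
import random


def toy_requirement(plan):
    """Hypothetical bounded requirement: never take action 0, take action 2 at least once."""
    return 0 not in plan and 2 in plan


def stacked_thompson_plan(simulate, n_actions, horizon, iterations, seed=0):
    """Search for a plan of length `horizon` using one Thompson bandit per step."""
    rng = random.Random(seed)
    # Assumption: one bandit per plan step. Each arm counts how often plans
    # that used it satisfied (wins) or violated (losses) the requirement.
    wins = [[0] * n_actions for _ in range(horizon)]
    losses = [[0] * n_actions for _ in range(horizon)]

    for _ in range(iterations):
        plan = []
        for t in range(horizon):
            # Thompson sampling: draw once from each arm's Beta posterior
            # (uniform prior) and play the arm with the largest draw.
            draws = [
                rng.betavariate(wins[t][a] + 1, losses[t][a] + 1)
                for a in range(n_actions)
            ]
            plan.append(max(range(n_actions), key=lambda a: draws[a]))

        # Evaluate the whole plan in simulation; the result is shared
        # by every bandit that contributed an action to it.
        satisfied = simulate(plan)
        for t, a in enumerate(plan):
            if satisfied:
                wins[t][a] += 1
            else:
                losses[t][a] += 1

    def posterior_mean(t, a):
        return (wins[t][a] + 1) / (wins[t][a] + losses[t][a] + 2)

    # Return, for each step, the action with the highest posterior mean.
    return [
        max(range(n_actions), key=lambda a: posterior_mean(t, a))
        for t in range(horizon)
    ]


if __name__ == "__main__":
    plan = stacked_thompson_plan(toy_requirement, n_actions=3, horizon=4, iterations=2000)
    print(plan, toy_requirement(plan))
```

In this sketch the search evaluates at most `iterations` plans rather than all `n_actions ** horizon` candidates, which is one way to read the abstract's claim about searching only a fraction of the search space. A stochastic simulator could replace `toy_requirement` without changing the update rule.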
Pages: 18-21
Page count: 4