Stacked Thompson Bandits

Cited by: 3
Authors
Belzner, Lenz [1]
Gabor, Thomas [1]
Affiliations
[1] Ludwig Maximilians Univ Munchen, Inst Informat, Munich, Germany
DOI
10.1109/SEsCPS.2017.4
Chinese Library Classification
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
We introduce Stacked Thompson Bandits (STB) for efficiently generating plans that are likely to satisfy a given bounded temporal logic requirement. STB evaluates plans in a simulation and takes a Bayesian approach to using the resulting information to guide its search. In particular, we show that stacking multi-armed bandits and using Thompson sampling to guide the action selection process for each bandit enables STB to generate plans that satisfy requirements with high probability while searching only a fraction of the search space.
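The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one Beta-Bernoulli bandit per plan step, a uniform prior, and a hypothetical `satisfies(plan)` callable that stands in for simulating a plan and checking the requirement; these names and modelling choices are illustrative assumptions, not taken from the paper.

```python
import random


def stacked_thompson_search(n_steps, n_actions, satisfies, n_iterations=5000, seed=0):
    """Sketch of a stack of Thompson-sampling bandits, one per plan step.

    Assumptions (not from the source): Beta-Bernoulli arms, uniform priors,
    and a boolean `satisfies(plan)` that hides the simulation.
    """
    rng = random.Random(seed)
    # alpha[t][a] and beta[t][a] are the Beta posterior parameters
    # for choosing action a at plan step t.
    alpha = [[1.0] * n_actions for _ in range(n_steps)]
    beta = [[1.0] * n_actions for _ in range(n_steps)]

    for _ in range(n_iterations):
        # Thompson sampling: draw one value per arm from its posterior
        # and pick the arm with the largest draw, independently per step.
        plan = []
        for t in range(n_steps):
            draws = [rng.betavariate(alpha[t][a], beta[t][a]) for a in range(n_actions)]
            plan.append(max(range(n_actions), key=draws.__getitem__))

        # Evaluate the whole plan, then credit or blame every chosen arm.
        success = satisfies(plan)
        for t, a in enumerate(plan):
            if success:
                alpha[t][a] += 1.0
            else:
                beta[t][a] += 1.0

    # Return the plan made of the arms with the highest posterior mean.
    return [
        max(range(n_actions), key=lambda a: alpha[t][a] / (alpha[t][a] + beta[t][a]))
        for t in range(n_steps)
    ]


def never_uses_action_zero(plan):
    """Toy stand-in for a bounded requirement: action 0 must never occur."""
    return all(action != 0 for action in plan)


best_plan = stacked_thompson_search(3, 2, never_uses_action_zero)
```

In this toy setting the search space has only eight plans, so the sketch shows the mechanics rather than any efficiency gain. Sharing one success signal across all steps is the simplest possible credit assignment and is chosen here only for brevity.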
Pages: 18-21
Page count: 4