Stacked Thompson Bandits

Cited by: 3

Authors:

Belzner, Lenz [1]

Gabor, Thomas [1]

Affiliations:

[1] Ludwig-Maximilians-Universität München, Institute of Informatics, Munich, Germany

Source:

2017 IEEE/ACM 3rd International Workshop on Software Engineering for Smart Cyber-Physical Systems (SEsCPS 2017) | 2017

Keywords:

DOI:

10.1109/SEsCPS.2017.4

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

We introduce Stacked Thompson Bandits (STB) for efficiently generating plans that are likely to satisfy a given bounded temporal logic requirement. STB uses a simulation to evaluate plans and takes a Bayesian approach to using the resulting information to guide its search. In particular, we show that stacking multi-armed bandits and using Thompson sampling to guide the action selection process for each bandit enables STB to generate plans that satisfy requirements with high probability while searching only a fraction of the search space.
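
The idea in the abstract (one bandit per plan step, Thompson sampling inside each bandit, a simulation deciding whether the requirement held) can be illustrated with a minimal sketch. Everything concrete here is an assumption and not taken from the paper: the Beta(1, 1) priors, the binary satisfied/violated reward, the rule that every bandit in the stack is updated with the same outcome, and the names `stacked_thompson_search` and `toy_simulate`. The toy simulator is a hypothetical stand-in for a real simulation and requirement check.

```python
import random


def toy_simulate(plan, rng, target=(1, 0, 2)):
    """Hypothetical stand-in for the simulation: returns True when the
    (made-up) requirement holds for this noisy run of the plan."""
    matches = sum(1 for a, b in zip(plan, target) if a == b)
    p_satisfied = 0.1 + 0.8 * matches / len(target)
    return rng.random() < p_satisfied


def stacked_thompson_search(simulate, horizon, n_actions, iterations, seed=0):
    """Sketch of a stack of Thompson-sampling bandits, one per plan step.

    Assumption: each arm keeps a Beta(alpha, beta) posterior over the
    probability that a plan using this action at this step satisfies
    the requirement.
    """
    rng = random.Random(seed)
    alpha = [[1.0] * n_actions for _ in range(horizon)]  # 1 + successes
    beta = [[1.0] * n_actions for _ in range(horizon)]   # 1 + failures

    for _ in range(iterations):
        # Build a plan: every bandit samples from its arms' posteriors
        # and plays the arm with the largest sample.
        plan = []
        for t in range(horizon):
            samples = [rng.betavariate(alpha[t][a], beta[t][a])
                       for a in range(n_actions)]
            plan.append(max(range(n_actions), key=samples.__getitem__))

        # Evaluate the whole plan once in the simulation.
        satisfied = simulate(plan, rng)

        # Update the arm that each bandit played with the shared outcome.
        for t, a in enumerate(plan):
            if satisfied:
                alpha[t][a] += 1.0
            else:
                beta[t][a] += 1.0

    # Report the action with the highest posterior mean at each step.
    best = [
        max(range(n_actions),
            key=lambda a: alpha[t][a] / (alpha[t][a] + beta[t][a]))
        for t in range(horizon)
    ]
    return best, alpha, beta


if __name__ == "__main__":
    best_plan, _, _ = stacked_thompson_search(
        toy_simulate, horizon=3, n_actions=3, iterations=2000)
    print(best_plan)
```

In this sketch the search space has 3^3 = 27 plans, so the saving is not visible; the point is only to show how per-step posteriors concentrate sampling on actions that tend to appear in satisfying plans.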


Pages: 18-21

Page count: 4

50 items in total

[1] A Thompson Sampling Algorithm for Cascading Bandits
Cheung, Wang Chi
Tan, Vincent Y. F.
Zhong, Zixin
22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89 : 438 - 447
[2] Thompson Sampling for Linearly Constrained Bandits
Saxena, Vidit
Gonzalez, Joseph E.
Jalden, Joakim
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108
[3] Double Thompson Sampling for Dueling Bandits
Wu, Huasen
Liu, Xin
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
[4] On the Performance of Thompson Sampling on Logistic Bandits
Dong, Shi
Ma, Tengyu
Van Roy, Benjamin
CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
[5] Thompson Sampling Algorithms for Cascading Bandits
Zhong, Zixin
Cheung, Wang Chi
Tan, Vincent Y. F.
JOURNAL OF MACHINE LEARNING RESEARCH, 2021, 22
[6] Thompson Sampling on Symmetric α-Stable Bandits
Dubey, Abhimanyu
Pentland, Alex Sandy
PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 5715 - 5721
[7] Thompson Sampling for Bandits with Clustered Arms
Carlsson, Emil
Dubhashi, Devdatt
Johansson, Fredrik D.
PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 2212 - 2218
[8] Thompson Sampling for Combinatorial Semi-Bandits
Wang, Siwei
Chen, Wei
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
[9] Thompson Sampling for Multinomial Logit Contextual Bandits
Oh, Min-hwan
Iyengar, Garud
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
[10] Thompson Sampling for Stochastic Bandits with Graph Feedback
Tossou, Aristide C. Y.
Dimitrakakis, Christos
Dubhashi, Devdatt
THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2660 - 2666
