Stacked Thompson Bandits

Cited by: 3

Authors:

Belzner, Lenz [1]

Gabor, Thomas [1]

Affiliations:

[1] Ludwig Maximilians Univ Munchen, Inst Informat, Munich, Germany

Source:

2017 IEEE/ACM 3RD INTERNATIONAL WORKSHOP ON SOFTWARE ENGINEERING FOR SMART CYBER-PHYSICAL SYSTEMS (SESCPS 2017) | 2017

DOI:

10.1109/SEsCPS.2017.4

Chinese Library Classification (CLC) number:

TP31 [Computer Software]

Discipline classification codes:

081202; 0835

Abstract:

We introduce Stacked Thompson Bandits (STB) for efficiently generating plans that are likely to satisfy a given bounded temporal logic requirement. STB evaluates plans in a simulation and takes a Bayesian approach to using the resulting information to guide its search. In particular, we show that stacking multi-armed bandits and using Thompson sampling to guide the action selection process for each bandit enables STB to generate plans that satisfy requirements with high probability while searching only a fraction of the search space.
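The abstract gives no algorithmic detail, so the following is only a minimal sketch of one plausible reading of "stacking" bandits, not the authors' implementation. It assumes one independent Bernoulli bandit per plan step, a Beta(1, 1) prior on every arm, and a binary reward equal to whether the simulated plan satisfied the requirement. The names `stacked_thompson_plan` and `reach_goal`, the toy one-dimensional world, and the slip probability are all hypothetical and chosen for illustration.

```python
import random


def stacked_thompson_plan(simulate, horizon, n_actions, n_iterations, seed=0):
    """Search for a plan using one Thompson-sampling bandit per plan step.

    Assumption (not from the source): bandits are independent and every arm
    keeps Beta(alpha, beta) pseudo-counts of satisfied / violated runs.
    """
    rng = random.Random(seed)
    alpha = [[1.0] * n_actions for _ in range(horizon)]
    beta = [[1.0] * n_actions for _ in range(horizon)]

    for _ in range(n_iterations):
        # Thompson sampling: draw one value per arm from its posterior
        # and let each bandit in the stack pick its best draw.
        plan = []
        for t in range(horizon):
            draws = [rng.betavariate(alpha[t][a], beta[t][a])
                     for a in range(n_actions)]
            plan.append(max(range(n_actions), key=draws.__getitem__))

        # Evaluate the whole plan in simulation against the requirement.
        satisfied = simulate(plan, rng)

        # Bayesian update of exactly the arms that formed this plan.
        for t, a in enumerate(plan):
            if satisfied:
                alpha[t][a] += 1.0
            else:
                beta[t][a] += 1.0

    # Report the plan with the highest posterior mean at each step.
    best = [
        max(range(n_actions),
            key=lambda a, t=t: alpha[t][a] / (alpha[t][a] + beta[t][a]))
        for t in range(horizon)
    ]
    return best, alpha, beta


def reach_goal(plan, rng, goal=3, slip=0.1):
    """Hypothetical bounded requirement: reach position >= goal within the plan.

    Actions 0, 1, 2 move by -1, 0, +1; with probability `slip` a move is lost.
    """
    moves = (-1, 0, 1)
    position = 0
    for action in plan:
        if rng.random() >= slip:
            position += moves[action]
        if position >= goal:
            return True
    return False


plan, alpha, beta = stacked_thompson_plan(
    reach_goal, horizon=5, n_actions=3, n_iterations=500)
print(plan)
```

Each iteration costs one simulation run, so 500 iterations evaluate at most 500 plans. Whether that is a small fraction of the search space depends on the horizon and the number of actions; in this toy setting there are only 3^5 = 243 distinct plans, so the sketch illustrates the mechanics rather than the efficiency claim.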

Pages: 18-21

Page count: 4
