Stochastic Conservative Contextual Linear Bandits

Cited by: 0

Authors:

Lin, Jiabin [1]

Lee, Xian Yeow [2]

Jubery, Talukder [2]

Moothedath, Shana [1]

Sarkar, Soumik [2]

Ganapathysubramanian, Baskar [2]

Affiliations:

[1] Iowa State Univ, Dept Elect & Comp Engn, Ames, IA 50011 USA

[2] Iowa State Univ, Dept Mech Engn, Ames, IA USA

Source:

2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC) | 2022

DOI:

10.1109/CDC51059.2022.9993209

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

In this paper, we introduce a conservative stochastic contextual bandit formulation for real-time decision making when an adversary chooses a distribution on the set of possible contexts and the learner is subject to certain safety/performance constraints. The learner observes only the context distribution, and the exact context is unknown, for instance when the context itself is a noisy measurement or comes from a forecasting mechanism. The goal is to develop an algorithm that selects a sequence of optimal actions to maximize the cumulative reward without violating the safety constraints at any time step. By leveraging the Upper Confidence Bound (UCB) algorithm for this setting, we propose a conservative linear UCB algorithm for stochastic bandits with context distribution. We prove an upper bound on the regret of the algorithm and show that it can be decomposed into three terms: (i) an upper bound for the regret of the standard linear UCB algorithm, (ii) a constant term (independent of time horizon) that accounts for the loss of being conservative in order to satisfy the safety constraint, and (iii) a constant term (independent of time horizon) that accounts for the loss due to the contexts being unknown, with only their distribution known. To validate the performance of our approach, we perform numerical simulations on synthetic data and on real-world maize data collected through the Genomes to Fields (G2F) initiative.
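The abstract does not give the algorithm itself, so the following is only a minimal Python sketch of how a conservative linear UCB rule with context distributions might look; it is not the authors' implementation. Everything specific here is an assumption made for illustration: the function name `conservative_linucb`, the use of action 0 as a baseline whose expected reward is known to the learner, the safety constraint in the form "cumulative expected reward stays above a `(1 - alpha)` fraction of the baseline's", the fixed confidence radius `beta`, and the synthetic data. The learner sees only the expected feature vector of each action under the context distribution (`psi`), while the reward is generated from a hidden realized feature (`phi`).

```python
import numpy as np

def conservative_linucb(T=500, d=3, K=5, alpha=0.1, beta=2.0, lam=1.0, seed=0):
    """Illustrative conservative LinUCB on expected features (all settings hypothetical)."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.2, 1.0, size=d)
    theta /= np.linalg.norm(theta)        # hidden parameter with positive entries

    V = lam * np.eye(d)                   # regularized design matrix
    b = np.zeros(d)                       # sum of feature * observed reward
    z = np.zeros(d)                       # summed expected features of optimistic plays
    base_played = 0.0                     # baseline reward earned on conservative rounds
    base_total = 0.0                      # baseline reward over all rounds so far
    expected_sum, n_base = 0.0, 0
    cum_reward, cum_base = [], []

    for _ in range(T):
        # Expected feature of each action under this round's context distribution.
        psi = rng.uniform(0.0, 1.0, size=(K, d))
        r_base = float(psi[0] @ theta)    # assumption: baseline reward is known

        V_inv = np.linalg.inv(V)
        theta_hat = V_inv @ b             # ridge-regression estimate
        widths = np.sqrt(np.einsum("kd,de,ke->k", psi, V_inv, psi))
        a = int(np.argmax(psi @ theta_hat + beta * widths))   # optimistic (UCB) action

        # Safety check: pessimistic estimate of the cumulative reward if we play `a`.
        cand = z + psi[a]
        lower = cand @ theta_hat - beta * np.sqrt(cand @ V_inv @ cand) + base_played
        if lower >= (1.0 - alpha) * (base_total + r_base):
            z = cand                      # safe: keep the optimistic action
        else:
            a = 0                         # not provably safe: fall back to the baseline
            base_played += r_base
            n_base += 1
        base_total += r_base

        # The realized context is hidden; only the noisy reward is observed.
        phi = psi[a] + 0.1 * rng.standard_normal(d)
        reward = float(phi @ theta) + 0.1 * rng.standard_normal()
        V += np.outer(psi[a], psi[a])     # regress the reward on the expected feature
        b += reward * psi[a]

        expected_sum += float(psi[a] @ theta)
        cum_reward.append(expected_sum)
        cum_base.append(base_total)

    return np.array(cum_reward), np.array(cum_base), n_base
```

Calling `conservative_linucb()` returns the cumulative expected reward of the learner, the cumulative expected reward of the baseline, and the number of rounds in which the baseline was played. In this sketch the early rounds fall back to the baseline until enough reward has accumulated to make exploration provably safe under the chosen confidence radius, which mirrors the "loss of being conservative" term described in the abstract.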

Pages: 7321-7326

Page count: 6
