Stochastic Conservative Contextual Linear Bandits

Cited by: 0
Authors
Lin, Jiabin [1]
Lee, Xian Yeow [2]
Jubery, Talukder [2]
Moothedath, Shana [1]
Sarkar, Soumik [2]
Ganapathysubramanian, Baskar [2]
Affiliations
[1] Iowa State Univ, Dept Elect & Comp Engn, Ames, IA 50011 USA
[2] Iowa State Univ, Dept Mech Engn, Ames, IA USA
DOI
10.1109/CDC51059.2022.9993209
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this paper, we propose a conservative stochastic contextual bandit formulation for real-time decision making in which an adversary chooses a distribution over the set of possible contexts and the learner is subject to certain safety/performance constraints. The learner observes only the context distribution; the exact context is unknown, for instance when the context itself is a noisy measurement or the output of a forecasting mechanism. The goal is to develop an algorithm that selects a sequence of optimal actions to maximize the cumulative reward without violating the safety constraints at any time step. Building on the Upper Confidence Bound (UCB) algorithm, we propose a conservative linear UCB algorithm for stochastic bandits with context distributions. We prove an upper bound on the regret of the algorithm and show that it decomposes into three terms: (i) an upper bound on the regret of the standard linear UCB algorithm, (ii) a constant term (independent of the time horizon) that accounts for the loss from being conservative in order to satisfy the safety constraint, and (iii) a constant term (independent of the time horizon) that accounts for the loss from the contexts being unknown, with only their distribution known. To validate the performance of our approach, we perform numerical simulations on synthetic data and on real-world maize data collected through the Genomes to Fields (G2F) initiative.
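The abstract combines three ideas: linear UCB, a conservative safety check against a baseline, and a learner that sees only a distribution over contexts. The Python sketch below illustrates how they might fit together; it is not the authors' algorithm. Everything in it is an assumption made for illustration: the names `ConservativeLinUCB` and `simulate`, a finite context set, a baseline arm whose expected reward is known to the learner, a fixed confidence radius `beta`, and a safety constraint of the form "cumulative expected reward is at least `(1 - alpha)` times the baseline's".

```python
import numpy as np


class ConservativeLinUCB:
    """Hypothetical sketch: LinUCB on expected features plus a conservative check."""

    def __init__(self, dim, alpha=0.1, lam=1.0, beta=1.0):
        self.alpha = alpha          # allowed fractional loss relative to the baseline
        self.beta = beta            # confidence radius (assumed valid, not tuned)
        self.V = lam * np.eye(dim)  # regularized Gram matrix
        self.b = np.zeros(dim)      # running sum of feature * observed reward
        self.z = np.zeros(dim)      # sum of features of non-baseline plays
        self.base_played = 0.0      # baseline reward earned on baseline plays
        self.base_total = 0.0       # baseline reward over all rounds so far

    def select(self, psi, base_idx, base_reward):
        """psi: (K, d) expected features; base_reward: assumed-known baseline mean."""
        V_inv = np.linalg.inv(self.V)
        theta_hat = V_inv @ self.b
        widths = np.sqrt(np.einsum("kd,de,ke->k", psi, V_inv, psi))
        cand = int(np.argmax(psi @ theta_hat + self.beta * widths))
        self.base_total += base_reward
        if cand != base_idx:
            # Pessimistic estimate of the reward of all non-baseline plays,
            # including the candidate, under the current confidence set.
            w = self.z + psi[cand]
            lcb = w @ theta_hat - self.beta * np.sqrt(w @ V_inv @ w)
            if lcb + self.base_played >= (1 - self.alpha) * self.base_total:
                self.z = w
                return cand
        self.base_played += base_reward  # fall back to the baseline arm
        return base_idx

    def update(self, x, y):
        self.V += np.outer(x, x)
        self.b += y * x


def simulate(horizon=200, n_arms=5, dim=3, n_contexts=4, noise=0.1,
             alpha=0.1, seed=0):
    """Toy environment (assumed): nonnegative features, unit-norm parameter."""
    rng = np.random.default_rng(seed)
    theta = rng.random(dim)
    theta /= np.linalg.norm(theta)
    phi = rng.random((n_contexts, n_arms, dim)) / np.sqrt(dim)
    base_idx = 0
    agent = ConservativeLinUCB(dim, alpha=alpha)
    actions, reward_sum, base_sum = [], [], []
    r_tot = b_tot = 0.0
    for _ in range(horizon):
        # The learner sees only the context distribution, not the context.
        weights = rng.random(n_contexts)
        weights /= weights.sum()
        psi = np.einsum("m,mkd->kd", weights, phi)  # expected features per arm
        base_reward = float(psi[base_idx] @ theta)
        a = agent.select(psi, base_idx, base_reward)
        c = rng.choice(n_contexts, p=weights)       # hidden realized context
        y = phi[c, a] @ theta + noise * rng.standard_normal()
        agent.update(psi[a], y)                     # regress on expected features
        r_tot += float(psi[a] @ theta)
        b_tot += base_reward
        actions.append(a)
        reward_sum.append(r_tot)
        base_sum.append(b_tot)
    return np.array(actions), np.array(reward_sum), np.array(base_sum)
```

Calling `simulate()` returns the chosen arms together with the cumulative expected reward of the learner and of the baseline, so the assumed constraint can be inspected round by round. In this sketch the first round always falls back to the baseline, because no data is available to certify any other arm.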
Pages: 7321-7326
Page count: 6
Related papers
50 in total
  • [41] The Impact of Batch Learning in Stochastic Linear Bandits
    Provodin, Danil
    Gajane, Pratik
    Pechenizkiy, Mykola
    Kaptein, Maurits
    2022 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2022, : 1149 - 1154
  • [42] Efficient and Robust High-Dimensional Linear Contextual Bandits
    Chen, Cheng
    Luo, Luo
    Zhang, Weinan
    Yu, Yong
    Lian, Yijiang
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 4259 - 4265
  • [43] Noise-Adaptive Thompson Sampling for Linear Contextual Bandits
    Xu, Ruitu
    Min, Yifei
    Wang, Tianhao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [44] Sparse Linear Contextual Bandits via Relevance Vector Machines
    Gilton, Davis
    Willett, Rebecca
    2017 INTERNATIONAL CONFERENCE ON SAMPLING THEORY AND APPLICATIONS (SAMPTA), 2017, : 518 - 522
  • [45] Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial Corruptions
    He, Jiafan
    Zhou, Dongruo
    Zhang, Tong
    Gu, Quanquan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [46] Distributed Contextual Linear Bandits with Minimax Optimal Communication Cost
    Amani, Sanae
    Lattimore, Tor
    Gyorgy, Andras
    Yang, Lin F.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 202, 2023, 202 : 691 - 717
  • [47] Context Enhancement for Linear Contextual Multi-Armed Bandits
    Gutowski, Nicolas
    Amghar, Tassadit
    Camp, Olivier
    Chhel, Fabien
    2018 IEEE 30TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2018, : 1048 - 1055
  • [48] Multi-Agent Learning with Heterogeneous Linear Contextual Bandits
    Anh Do
    Thanh Nguyen-Tang
    Arora, Raman
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [49] Delay-Adaptive Learning in Generalized Linear Contextual Bandits
    Blanchet, Jose
    Xu, Renyuan
    Zhou, Zhengyuan
    MATHEMATICS OF OPERATIONS RESEARCH, 2024, 49 (01) : 326 - 345
  • [50] A Parameter-Free Algorithm for Misspecified Linear Contextual Bandits
    Takemura, Kei
    Ito, Shinji
    Hatano, Daisuke
    Sumita, Hanna
    Fukunaga, Takuro
    Kakimura, Naonori
    Kawarabayashi, Ken-ichi
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130