Stochastic Conservative Contextual Linear Bandits

Cited by: 0
Authors
Lin, Jiabin [1]
Lee, Xian Yeow [2]
Jubery, Talukder [2]
Moothedath, Shana [1]
Sarkar, Soumik [2]
Ganapathysubramanian, Baskar [2]
Affiliations
[1] Iowa State Univ, Dept Elect & Comp Engn, Ames, IA 50011 USA
[2] Iowa State Univ, Dept Mech Engn, Ames, IA USA
DOI
10.1109/CDC51059.2022.9993209
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this paper, we propose a conservative stochastic contextual bandit formulation for real-time decision making when an adversary chooses a distribution on the set of possible contexts and the learner is subject to certain safety/performance constraints. The learner observes only the context distribution; the exact context is unknown, for instance when the context itself is a noisy measurement or the output of a forecasting mechanism. The goal is to develop an algorithm that selects a sequence of optimal actions to maximize the cumulative reward without violating the safety constraints at any time step. By leveraging the Upper Confidence Bound (UCB) algorithm for this setting, we propose a conservative linear UCB algorithm for stochastic bandits with context distribution. We prove an upper bound on the regret of the algorithm and show that it can be decomposed into three terms: (i) an upper bound on the regret of the standard linear UCB algorithm, (ii) a constant term (independent of the time horizon) that accounts for the loss incurred by being conservative in order to satisfy the safety constraint, and (iii) a constant term (independent of the time horizon) that accounts for the loss incurred because the exact contexts are unknown and only their distribution is known. To validate the performance of our approach, we perform numerical simulations on synthetic data and on real-world maize data collected through the Genomes to Fields (G2F) initiative.
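The abstract does not spell out the algorithm, so the following is only a minimal sketch of how one decision step of a conservative linear UCB rule might look, not the authors' implementation. It assumes (none of this is stated in the record) that each action is represented by its expected feature vector under the observed context distribution, that a baseline action with known mean reward `baseline_reward` is available, and that the safety constraint requires the cumulative reward to stay above a `(1 - alpha)` fraction of the baseline's cumulative reward. The function name, all parameter names, and the running `pessimistic_total` bookkeeping are hypothetical simplifications.

```python
import numpy as np

def conservative_linucb_choice(V, b, expected_features, baseline_idx,
                               baseline_reward, pessimistic_total, t,
                               alpha=0.1, beta=1.0):
    """Pick an action index for round t (hypothetical sketch).

    V, b              : regularized Gram matrix (d x d) and response vector (d,)
    expected_features : (K, d) array; row k is the feature vector of action k
                        averaged over the context distribution (assumption)
    pessimistic_total : lower bound on the reward collected in earlier rounds
                        (assumed to be tracked by the caller)
    """
    V_inv = np.linalg.inv(V)
    theta_hat = V_inv @ b  # ridge-regression estimate of the reward parameter

    means = expected_features @ theta_hat
    # Confidence width beta * sqrt(x^T V^{-1} x) for every action x.
    widths = beta * np.sqrt(
        np.einsum("ij,jk,ik->i", expected_features, V_inv, expected_features)
    )

    # Optimistic (UCB) candidate.
    candidate = int(np.argmax(means + widths))

    # Play the candidate only if a pessimistic estimate of the cumulative
    # reward still meets the assumed (1 - alpha) * baseline constraint.
    lower_bound = means[candidate] - widths[candidate]
    if pessimistic_total + lower_bound >= (1.0 - alpha) * t * baseline_reward:
        return candidate
    return baseline_idx  # otherwise fall back to the safe baseline action
```

In this sketch the learner starts by playing the baseline while its confidence widths are large, and switches to optimistic actions once enough pessimistic reward has accumulated to cover the constraint.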
Pages: 7321-7326
Number of pages: 6
Related papers
50 in total
  • [11] Federated Linear Contextual Bandits
    Huang, Ruiquan
    Wu, Weiqiang
    Yang, Jing
    Shen, Cong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [12] Contexts can be Cheap: Solving Stochastic Contextual Bandits with Linear Bandit Algorithms
    Hanna, Osama A.
    Yang, Lin F.
    Fragouli, Christina
    THIRTY SIXTH ANNUAL CONFERENCE ON LEARNING THEORY, VOL 195, 2023, 195
  • [13] Adversarial Attacks on Linear Contextual Bandits
    Garcelon, Evrard
    Roziere, Baptiste
    Meunier, Laurent
    Tarbouriech, Jean
    Teytaud, Olivier
    Lazaric, Alessandro
    Pirotta, Matteo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS (NEURIPS 2020), 2020, 33
  • [14] Shuffle Private Linear Contextual Bandits
    Chowdhury, Sayak Ray
    Zhou, Xingyu
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [15] Differentially Private Contextual Linear Bandits
    Shariff, Roshan
    Sheffet, Or
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [16] An Efficient Algorithm for Deep Stochastic Contextual Bandits
    Zhu, Tan
    Liang, Guannan
    Zhu, Chunjiang
    Li, Haining
    Bi, Jinbo
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 11193 - 11201
  • [17] Optimal Algorithms for Stochastic Contextual Preference Bandits
    Saha, Aadirupa
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [18] Stochastic Contextual Bandits with Long Horizon Rewards
    Qin, Yuzhen
    Li, Yingcong
    Pasqualetti, Fabio
    Fazel, Maryam
    Oymak, Samet
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 8, 2023 : 9525 - 9533
  • [19] Safe Linear Stochastic Bandits
    Khezeli, Kia
    Bitar, Eilyan
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 10202 - 10209
  • [20] Stochastic Bandits with Linear Constraints
    Pacchiano, Aldo
    Ghavamzadeh, Mohammad
    Bartlett, Peter
    Jiang, Heinrich
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130