Stochastic Conservative Contextual Linear Bandits

Cited by: 0
Authors
Lin, Jiabin [1]
Lee, Xian Yeow [2]
Jubery, Talukder [2]
Moothedath, Shana [1]
Sarkar, Soumik [2]
Ganapathysubramanian, Baskar [2]
Affiliations
[1] Iowa State Univ, Dept Elect & Comp Engn, Ames, IA 50011 USA
[2] Iowa State Univ, Dept Mech Engn, Ames, IA USA
DOI
10.1109/CDC51059.2022.9993209
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, we propose a conservative stochastic contextual bandit formulation for real-time decision making in which an adversary chooses a distribution over the set of possible contexts and the learner is subject to safety/performance constraints. The learner observes only the context distribution; the exact context is unknown, for instance when the context is a noisy measurement or comes from a forecasting mechanism. The goal is to develop an algorithm that selects a sequence of actions to maximize the cumulative reward without violating the safety constraints at any time step. Building on the Upper Confidence Bound (UCB) algorithm, we propose a conservative linear UCB algorithm for stochastic bandits with context distribution. We prove an upper bound on the regret of the algorithm and show that it decomposes into three terms: (i) an upper bound on the regret of the standard linear UCB algorithm, (ii) a constant term (independent of the time horizon) that accounts for the loss incurred by being conservative in order to satisfy the safety constraint, and (iii) a constant term (independent of the time horizon) that accounts for the loss incurred because the contexts are unknown and only their distribution is known. To validate the performance of our approach, we perform numerical simulations on synthetic data and on real-world maize data collected through the Genomes to Fields (G2F) initiative.
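The idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or code: it is a simplified conservative linear UCB loop written under several assumptions. The function name `conservative_linucb`, the fixed confidence radius `beta`, the constant known `baseline_reward`, and the simulated rewards are all hypothetical choices made for illustration. Each round supplies, for every action, a feature vector already averaged over the context distribution, and the optimistic action is played only if a pessimistic estimate of the cumulative reward stays above a fraction `1 - alpha` of the cumulative baseline reward.

```python
import numpy as np


def conservative_linucb(features, theta_star, baseline_reward, alpha=0.1,
                        lam=1.0, beta=1.0, noise_sd=0.1, seed=0):
    """Simplified conservative linear UCB (illustrative sketch only).

    features: array of shape (T, K, d); features[t, a] is assumed to be the
        feature vector of action a averaged over the round-t context distribution.
    theta_star: true parameter, used here only to simulate rewards.
    baseline_reward: assumed known, constant expected reward of the baseline action.
    Returns a list of chosen actions, with -1 meaning "baseline played".
    """
    rng = np.random.default_rng(seed)
    T, K, d = features.shape
    V = lam * np.eye(d)      # regularized Gram matrix
    b = np.zeros(d)          # sum of reward-weighted features
    z = np.zeros(d)          # sum of features of optimistic (non-baseline) plays
    n_baseline = 0           # number of rounds in which the baseline was played
    choices = []
    for t in range(T):
        V_inv = np.linalg.inv(V)
        theta_hat = V_inv @ b
        psi = features[t]
        # Optimistic index: estimated reward plus confidence width per action.
        widths = np.sqrt(np.einsum("kd,de,ke->k", psi, V_inv, psi))
        a = int(np.argmax(psi @ theta_hat + beta * widths))
        # Pessimistic estimate of total reward if the optimistic action is played.
        cand = z + psi[a]
        lower = cand @ theta_hat - beta * np.sqrt(cand @ V_inv @ cand)
        budget = (1 - alpha) * (t + 1) * baseline_reward
        if lower + n_baseline * baseline_reward >= budget:
            # Simulated noisy reward (assumption: linear in the expected features).
            reward = psi[a] @ theta_star + noise_sd * rng.normal()
            V += np.outer(psi[a], psi[a])
            b += reward * psi[a]
            z = cand
            choices.append(a)
        else:
            n_baseline += 1
            choices.append(-1)
    return choices
```

With positive `baseline_reward` and no data, the pessimistic estimate is negative in the first round, so the sketch starts by playing the baseline and switches to optimistic actions only once enough reward margin has accumulated.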
Pages: 7321-7326
Page count: 6
Related papers
50 in total
  • [1] Conservative Contextual Linear Bandits
    Kazerouni, Abbas
    Ghavamzadeh, Mohammad
    Abbasi-Yadkori, Yasin
    Van Roy, Benjamin
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [2] Stochastic Linear Contextual Bandits with Diverse Contexts
    Wu, Weiqiang
    Yang, Jing
    Shen, Cong
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108
  • [3] Design of Experiments for Stochastic Contextual Linear Bandits
    Zanette, Andrea
    Dong, Kefan
    Lee, Jonathan
    Brunskill, Emma
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [4] Learning in Generalized Linear Contextual Bandits with Stochastic Delays
    Zhou, Zhengyuan
    Xu, Renyuan
    Blanchet, Jose
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [5] Stochastic Contextual Dueling Bandits under Linear Stochastic Transitivity Models
    Bengs, Viktor
    Saha, Aadirupa
    Huellermeier, Eyke
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022, 162
  • [6] Robust Stochastic Linear Contextual Bandits Under Adversarial Attacks
    Ding, Qin
    Hsieh, Cho-Jui
    Sharpnack, James
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
  • [7] Nonparametric Stochastic Contextual Bandits
    Guan, Melody Y.
    Jiang, Heinrich
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 3119 - 3125
  • [8] Contextual Bandits with Stochastic Experts
    Sen, Rajat
    Shanmugam, Karthikeyan
    Shakkottai, Sanjay
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84
  • [9] Balanced Linear Contextual Bandits
    Dimakopoulou, Maria
    Zhou, Zhengyuan
    Athey, Susan
    Imbens, Guido
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 3445 - 3453
  • [10] Linear Contextual Bandits with Knapsacks
    Agrawal, Shipra
    Devanur, Nikhil R.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29