Recurrent Submodular Welfare and Matroid Blocking Semi-Bandits

Cited by: 0
Authors
Papadigenopoulos, Orestis [1]
Caramanis, Constantine [2]
Affiliations
[1] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
[2] Univ Texas Austin, Elect & Comp Engn, Austin, TX 78712 USA
Keywords
COMPLEXITY
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A recent line of research focuses on the study of stochastic multi-armed bandits (MAB), in the case where temporal correlations of specific structure are imposed between the player's actions and the reward distributions of the arms. These correlations lead to (sub-)optimal solutions that exhibit interesting dynamical patterns, a phenomenon that yields new challenges from both an algorithmic and a learning perspective. In this work, we extend the above direction to a combinatorial semi-bandit setting and study a variant of stochastic MAB, where arms are subject to matroid constraints and each arm becomes unavailable (blocked) for a fixed number of rounds after each play. A natural common generalization of the state-of-the-art for blocking bandits, and that for matroid bandits, only guarantees a 1/2-approximation for general matroids. In this paper, we develop the novel technique of correlated (interleaved) scheduling, which allows us to obtain a polynomial-time (1 - 1/e)-approximation algorithm (asymptotically and in expectation) for any matroid. Along the way, we discover an interesting connection to a variant of Submodular Welfare Maximization, for which we provide (asymptotically) matching upper and lower approximability bounds. In the case where the mean arm rewards are unknown, our technique naturally decouples the scheduling from the learning problem, and thus allows us to control the (1 - 1/e)-approximate regret of a UCB-based adaptation of our online algorithm.
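The problem setting in the abstract can be made concrete with a small sketch. The Python below is an illustration only, in the spirit of the natural greedy baseline the abstract mentions; it is not the paper's correlated (interleaved) scheduling technique, and no approximation guarantee is claimed for it. It assumes known mean rewards and a uniform matroid of rank `k` (at most `k` arms per round), both chosen here for simplicity. The names `greedy_blocking_schedule`, `total_expected_reward`, `means`, `blocked_rounds`, and `k` are hypothetical and do not come from the paper.

```python
from typing import List


def greedy_blocking_schedule(
    means: List[float], blocked_rounds: List[int], k: int, horizon: int
) -> List[List[int]]:
    """Each round, play up to k available arms with the highest mean reward.

    Assumptions (illustrative, not taken from the paper):
    - the matroid is uniform of rank k, so any set of at most k arms is feasible;
    - an arm played at round t is unavailable for the next blocked_rounds[i] rounds.
    """
    n = len(means)
    next_free = [0] * n  # first round at which each arm may be played again
    order = sorted(range(n), key=lambda i: -means[i])  # arms by decreasing mean
    schedule = []
    for t in range(horizon):
        chosen = []
        for i in order:
            if len(chosen) == k:
                break
            if next_free[i] <= t:  # arm i is not blocked at round t
                chosen.append(i)
                next_free[i] = t + 1 + blocked_rounds[i]
        schedule.append(chosen)
    return schedule


def total_expected_reward(means: List[float], schedule: List[List[int]]) -> float:
    """Sum of mean rewards over all plays in the schedule."""
    return sum(means[i] for played in schedule for i in played)


if __name__ == "__main__":
    means = [0.9, 0.5, 0.2]
    blocked_rounds = [1, 0, 0]  # arm 0 must rest one round after each play
    plan = greedy_blocking_schedule(means, blocked_rounds, k=1, horizon=4)
    print(plan, total_expected_reward(means, plan))
```

With these inputs, the best arm is played every other round and the second-best arm fills the rounds in which it is blocked. When the means are unknown, a learning variant would replace `means` with optimistic estimates such as UCB indices, which is the direction the abstract describes.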
Pages: 13