Recurrent Submodular Welfare and Matroid Blocking Semi-Bandits

Cited by: 0
Authors
Papadigenopoulos, Orestis [1]
Caramanis, Constantine [2]
Affiliations
[1] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
[2] Univ Texas Austin, Elect & Comp Engn, Austin, TX 78712 USA
Keywords
COMPLEXITY;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A recent line of research studies stochastic multi-armed bandits (MAB) in the case where temporal correlations of a specific structure are imposed between the player's actions and the reward distributions of the arms. These correlations lead to (sub-)optimal solutions that exhibit interesting dynamical patterns, a phenomenon that yields new challenges from both an algorithmic and a learning perspective. In this work, we extend this direction to a combinatorial semi-bandit setting and study a variant of stochastic MAB in which arms are subject to matroid constraints and each arm becomes unavailable (blocked) for a fixed number of rounds after each play. A natural common generalization of the state of the art for blocking bandits and that for matroid bandits only guarantees a 1/2-approximation for general matroids. In this paper we develop the novel technique of correlated (interleaved) scheduling, which allows us to obtain a polynomial-time (1 - 1/e)-approximation algorithm (asymptotically and in expectation) for any matroid. Along the way, we discover an interesting connection to a variant of Submodular Welfare Maximization, for which we provide (asymptotically) matching upper and lower approximability bounds. In the case where the mean arm rewards are unknown, our technique naturally decouples the scheduling problem from the learning problem, and thus allows us to control the (1 - 1/e)-approximate regret of a UCB-based adaptation of our online algorithm.
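To make the setting in the abstract concrete, the following is a minimal sketch of a per-round greedy baseline for matroid blocking bandits with known means. It is only one plausible reading of the "natural common generalization" the abstract mentions, and it is not the correlated (interleaved) scheduling technique of the paper. The function name `greedy_blocking_schedule`, the parameter `blocked_for` (rounds an arm stays unavailable after a play), the independence oracle `is_independent`, and the toy numbers are all assumptions introduced for illustration.

```python
from typing import Callable, List, Sequence, Set


def greedy_blocking_schedule(
    means: Sequence[float],
    blocked_for: Sequence[int],
    is_independent: Callable[[Set[int]], bool],
    horizon: int,
) -> List[List[int]]:
    """Each round, greedily pick an independent set among the available arms.

    Assumption: means are known and non-negative, and an arm played in
    round t is unavailable for the next blocked_for[i] rounds.
    """
    n = len(means)
    next_free = [0] * n  # first round in which each arm may be played again
    order = sorted(range(n), key=lambda i: -means[i])  # decreasing mean reward
    schedule: List[List[int]] = []
    for t in range(horizon):
        chosen: Set[int] = set()
        for i in order:
            # Add the arm if it is not blocked and independence is preserved.
            if next_free[i] <= t and is_independent(chosen | {i}):
                chosen.add(i)
        for i in chosen:
            next_free[i] = t + blocked_for[i] + 1
        schedule.append(sorted(chosen))
    return schedule


# Hypothetical toy instance: uniform matroid of rank 2 over 4 arms.
means = [0.9, 0.8, 0.5, 0.2]
blocked_for = [2, 1, 0, 0]
schedule = greedy_blocking_schedule(
    means, blocked_for, lambda s: len(s) <= 2, horizon=4
)
print(schedule)
```

In this toy run the two best arms are played together in the first round and are then blocked at the same time, so the second round falls back to the two weakest arms. This kind of myopic pattern is what a scheduling approach that plans across rounds would aim to avoid.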
Pages: 13
Related papers
50 in total
  • [21] Improving Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms and Its Applications
    Wang, Qinshi
    Chen, Wei
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [22] Closing the Computational-Statistical Gap in Best Arm Identification for Combinatorial Semi-bandits
    Tzeng, Ruo-Chun
    Wang, Po-An
    Proutiere, Alexandre
    Lu, Chi-Jen
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [23] Near-Optimal Regret Bounds for Contextual Combinatorial Semi-Bandits with Linear Payoff Functions
    Takemura, Kei
    Ito, Shinji
    Hatano, Daisuke
    Sumita, Hanna
    Fukunaga, Takuro
    Kakimura, Naonori
    Kawarabayashi, Ken-ichi
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 9791 - 9798
  • [24] Batch-Size Independent Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms or Independent Arms
    Liu, Xutong
    Zuo, Jinhang
    Wang, Siwei
    Joe-Wong, Carlee
    Lui, John C. S.
    Chen, Wei
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
  • [25] Blocking Bandits
    Basu, Soumya
    Sen, Rajat
    Sanghavi, Sujay
    Shakkottai, Sanjay
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [26] Delay-Aware Service Caching in Edge Cloud: An Adversarial Semi-Bandits Learning-based Approach
    Li, Jinpeng
    Xia, Yunni
    Sun, Xiaoning
    Chen, Peng
    Li, Xiaobo
    Feng, Jiafeng
    2024 IEEE 17TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, CLOUD 2024, 2024, : 411 - 418
  • [27] A Near-Optimal Change-Detection Based Algorithm for Piecewise-Stationary Combinatorial Semi-Bandits
    Zhou, Huozhi
    Wang, Lingda
    Varshney, Lav R.
    Lim, Ee-Peng
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 6933 - 6940
  • [28] Matroid Bandits: Fast Combinatorial Optimization with Learning
    Kveton, Branislav
    Wen, Zheng
    Ashkan, Azin
    Eydgahi, Hoda
    Eriksson, Brian
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2014, : 420 - 429
  • [29] Contextual Blocking Bandits
    Basu, Soumya
    Papadigenopoulos, Orestis
    Caramanis, Constantine
    Shakkottai, Sanjay
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130 : 271 - +
  • [30] Adversarial Blocking Bandits
    Bishop, Nicholas
    Chan, Hau
    Mandal, Debmalya
    Tran-Thanh, Long
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS (NEURIPS 2020), 2020, 33