Recurrent Submodular Welfare and Matroid Blocking Semi-Bandits

Authors
Papadigenopoulos, Orestis [1]
Caramanis, Constantine [2]
Affiliations
[1] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
[2] Univ Texas Austin, Elect & Comp Engn, Austin, TX 78712 USA
Keywords
COMPLEXITY
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A recent line of research focuses on the study of stochastic multi-armed bandits (MAB) in the case where temporal correlations of specific structure are imposed between the player's actions and the reward distributions of the arms. These correlations lead to (sub-)optimal solutions that exhibit interesting dynamical patterns, a phenomenon that yields new challenges from both an algorithmic and a learning perspective. In this work, we extend the above direction to a combinatorial semi-bandit setting and study a variant of stochastic MAB where arms are subject to matroid constraints and each arm becomes unavailable (blocked) for a fixed number of rounds after each play. A natural common generalization of the state of the art for blocking bandits and that for matroid bandits guarantees only a 1/2-approximation for general matroids. In this paper we develop the novel technique of correlated (interleaved) scheduling, which allows us to obtain a polynomial-time (1 - 1/e)-approximation algorithm (asymptotically and in expectation) for any matroid. Along the way, we discover an interesting connection to a variant of Submodular Welfare Maximization, for which we provide (asymptotically) matching upper and lower approximability bounds. In the case where the mean arm rewards are unknown, our technique naturally decouples the scheduling problem from the learning problem, and thus allows us to control the (1 - 1/e)-approximate regret of a UCB-based adaptation of our online algorithm.
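To make the setting concrete, the following is a minimal sketch of the naive greedy baseline that the abstract contrasts against (per round, pick a maximum-weight independent set among the currently available arms), not the paper's interleaved scheduling algorithm. It assumes known mean rewards, and the function names, the `is_independent` oracle, and the delay convention (an arm with delay d played in round t is next available in round t + d) are illustrative assumptions rather than details taken from the paper.

```python
from typing import Callable, List, Sequence, Set


def greedy_blocking_schedule(
    means: Sequence[float],
    delays: Sequence[int],
    is_independent: Callable[[Set[int]], bool],
    horizon: int,
) -> List[List[int]]:
    """Per round, greedily build an independent set from the available arms.

    Assumed convention: arm i played in round t is blocked until round
    t + delays[i], so delays[i] == 1 means the arm is never blocked.
    """
    n = len(means)
    next_free = [0] * n  # first round in which each arm may be played again
    # Matroid greedy: consider arms in order of decreasing mean reward.
    order = sorted(range(n), key=lambda i: means[i], reverse=True)
    schedule: List[List[int]] = []
    for t in range(horizon):
        chosen: Set[int] = set()
        for i in order:
            if next_free[i] <= t and is_independent(chosen | {i}):
                chosen.add(i)
        for i in chosen:
            next_free[i] = t + delays[i]
        schedule.append(sorted(chosen))
    return schedule


def uniform_matroid(rank: int) -> Callable[[Set[int]], bool]:
    """Independence oracle for a uniform matroid: any set of size <= rank."""
    return lambda arms: len(arms) <= rank


if __name__ == "__main__":
    # Hypothetical instance: 4 arms, at most 2 arms played per round.
    plan = greedy_blocking_schedule(
        means=[0.9, 0.8, 0.5, 0.3],
        delays=[2, 3, 1, 2],
        is_independent=uniform_matroid(2),
        horizon=4,
    )
    for t, arms in enumerate(plan):
        print(t, arms)
```

Any matroid can be plugged in by supplying a different independence oracle; the uniform matroid is used here only because it is the simplest case to check by hand.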
Pages: 13