Learning from Delayed Semi-Bandit Feedback under Strong Fairness Guarantees

Cited by: 5
Authors
Steiger, Juaren [1]
Li, Bin [2]
Lu, Ning [1]
Affiliations
[1] Queens Univ, Dept Elect & Comp Engn, Kingston, ON, Canada
[2] Penn State Univ, Sch Elect Engn & Comp Sci, State Coll, PA USA
DOI
10.1109/INFOCOM48880.2022.9796683
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Multi-armed bandit frameworks, including combinatorial semi-bandits and sleeping bandits, are commonly employed to model problems in communication networks and other engineering domains. In such problems, feedback to the learning agent is often delayed (e.g., communication delays in a wireless network or conversion delays in online advertising). Moreover, arms in a bandit problem often represent entities that must be treated fairly, i.e., each arm should be played at least a required fraction of the time. In contrast to the previously studied asymptotic fairness, many real-time systems require such fairness guarantees to hold even in the short term (e.g., ensuring the credibility of information flows in an industrial Internet of Things (IoT) system). To that end, we develop the Learning with Delays under Fairness (LDF) algorithm, which solves combinatorial semi-bandit problems with sleeping arms and delayed feedback and which we prove guarantees strong (short-term) fairness. While previous theoretical work on bandit problems with delayed feedback typically derives instance-dependent regret bounds, this approach proves challenging when fairness is considered simultaneously. We instead derive a novel instance-independent regret bound for this setting that agrees with state-of-the-art bounds. We verify our theoretical results with extensive simulations using both synthetic and real-world datasets.
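The distinction the abstract draws between asymptotic and short-term fairness can be illustrated with a small check. This is only a sketch and not the LDF algorithm: it assumes one possible formalization in which arm i, with required fraction min_frac[i], must have been played at least floor(min_frac[i] * t) times after every round t. The paper's exact definition may differ, and the function names and the `history` representation are hypothetical.

```python
from math import floor
from typing import List, Sequence, Set


def play_counts(history: Sequence[Set[int]], num_arms: int) -> List[List[int]]:
    """Cumulative play counts after each round.

    history[t - 1] is the set of arms played in round t (semi-bandit setting:
    several arms may be played per round).
    """
    counts = [0] * num_arms
    snapshots = []
    for played in history:
        for arm in played:
            counts[arm] += 1
        snapshots.append(counts.copy())
    return snapshots


def fair_at_horizon(history: Sequence[Set[int]], min_frac: Sequence[float]) -> bool:
    """Long-run style check: only the final counts are compared (assumed formalization)."""
    if not history:
        return True
    horizon = len(history)
    final = play_counts(history, len(min_frac))[-1]
    return all(final[i] >= floor(min_frac[i] * horizon) for i in range(len(min_frac)))


def fair_at_every_prefix(history: Sequence[Set[int]], min_frac: Sequence[float]) -> bool:
    """Short-term check: the same inequality must hold after every round t."""
    snapshots = play_counts(history, len(min_frac))
    return all(
        snapshots[t - 1][i] >= floor(min_frac[i] * t)
        for t in range(1, len(history) + 1)
        for i in range(len(min_frac))
    )


# Two arms, each required half of the time, one arm played per round.
bursty = [{0}, {0}, {1}, {1}]        # arm 1 is starved during the first two rounds
alternating = [{0}, {1}, {0}, {1}]   # requirement is met after every round
required = [0.5, 0.5]
```

Under this assumed definition, the `bursty` schedule passes the end-of-horizon check but fails the per-round check, while the `alternating` schedule passes both. That gap is what a short-term guarantee is meant to rule out.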
Pages: 1379-1388
Number of pages: 10