50 items in total
- [31] Complete Policy Regret Bounds for Tallying Bandits CONFERENCE ON LEARNING THEORY, VOL 178, 2022, 178
- [32] Routine Bandits: Minimizing Regret on Recurring Problems MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, 2021, 12975 : 3 - 18
- [33] Pure Exploration and Regret Minimization in Matching Bandits INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
- [34] Simple regret for infinitely many armed bandits INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37, 2015, 37 : 1133 - 1141
- [35] Optimal Regret Bounds for Collaborative Learning in Bandits INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 237, 2024, 237
- [36] An α-No-Regret Algorithm For Graphical Bilinear Bandits ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
- [37] Robustness Guarantees for Mode Estimation with an Application to Bandits THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 9277 - 9284
- [38] Finite-sample Guarantees for Nash Q-learning with Linear Function Approximation UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 424 - 432
- [39] Near-Optimal Regret Bounds for Contextual Combinatorial Semi-Bandits with Linear Payoff Functions THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 9791 - 9798
- [40] Online switching control with stability and regret guarantees LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211