50 entries in total
- [21] Minimax Value Interval for Off-Policy Evaluation and Policy Optimization ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
- [22] MULTI-ARMED BANDITS AND THE GITTINS INDEX JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1980, 42 (02) : 143 - 149
- [23] Multi-armed bandits with episode context Annals of Mathematics and Artificial Intelligence, 2011, 61 : 203 - 230
- [25] Active Learning in Multi-armed Bandits ALGORITHMIC LEARNING THEORY, PROCEEDINGS, 2008, 5254 : 287 - +
- [27] Multi-Armed Bandits with Cost Subsidy 24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
- [28] Batched Multi-armed Bandits Problem ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
- [29] Are Multi-Armed Bandits Susceptible to Peeking? ZAGREB INTERNATIONAL REVIEW OF ECONOMICS & BUSINESS, 2018, 21 (01) : 95 - 104
- [30] Secure Outsourcing of Multi-Armed Bandits 2020 IEEE 19TH INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (TRUSTCOM 2020), 2020 : 202 - 209