50 entries in total
- [21] Regret Bounds for Reinforcement Learning via Markov Chain Concentration. Journal of Artificial Intelligence Research, 2020, 67: 115-128
- [22] Improved Bayesian Regret Bounds for Thompson Sampling in Reinforcement Learning. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023
- [23] Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022
- [24] Regret Bounds for Reinforcement Learning via Markov Chain Concentration. Journal of Artificial Intelligence Research, 2020, 67: 115-128
- [25] Improved Regret Analysis for Variance-Adaptive Linear Bandits and Horizon-Free Linear Mixture MDPs. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022
- [26] Horizon-Free and Instance-Dependent Regret Bounds for Reinforcement Learning with General Function Approximation. International Conference on Artificial Intelligence and Statistics, Vol. 238, 2024
- [27] Logarithmic Regret for Reinforcement Learning with Linear Function Approximation. International Conference on Machine Learning, Vol. 139, 2021
- [28] On Instance-Dependent Bounds for Offline Reinforcement Learning with Linear Function Approximation. Thirty-Seventh AAAI Conference on Artificial Intelligence, Vol. 37, No. 8, 2023: 9310-9318
- [29] Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning. Advances in Neural Information Processing Systems 30 (NIPS 2017), 2017