50 entries in total
- [41] Best-of-Three-Worlds Linear Bandit Algorithm with Variance-Adaptive Regret Bounds THIRTY SIXTH ANNUAL CONFERENCE ON LEARNING THEORY, VOL 195, 2023, 195
- [42] Problem-dependent regret bounds for online learning with feedback graphs 35TH UNCERTAINTY IN ARTIFICIAL INTELLIGENCE CONFERENCE (UAI 2019), 2020, 115 : 852 - 861
- [43] Provably Efficient Reinforcement Learning with Linear Function Approximation under Adaptivity Constraints ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
- [44] Regret Bounds for Risk-sensitive Reinforcement Learning with Lipschitz Dynamic Risk Measures INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
- [45] On Gap-dependent Bounds for Offline Reinforcement Learning ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
- [46] Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
- [47] Tackling Heavy-Tailed Rewards in Reinforcement Learning with Function Approximation: Minimax Optimal and Instance-Dependent Regret Bounds ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
- [48] Uniform-PAC Bounds for Reinforcement Learning with Linear Function Approximation ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021
- [49] A Tighter Problem-Dependent Regret Bound for Risk-Sensitive Reinforcement Learning INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 206, 2023, 206