50 items in total
- [42] Combining Learning Algorithms: An Approach to Markov Decision Processes. ENTERPRISE INFORMATION SYSTEMS, ICEIS 2012, 2013, 141: 172-188
- [43] Hierarchical algorithms for discounted and weighted Markov decision processes. Mathematical Methods of Operations Research, 2003, 58: 237-245
- [44] IMED-RL: Regret optimal learning of ergodic Markov decision processes. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
- [45] Sampling Based Approaches for Minimizing Regret in Uncertain Markov Decision Processes (MDPs). JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2017, 59: 229-264
- [46] Online Learning of Safety function for Markov Decision Processes. 2023 EUROPEAN CONTROL CONFERENCE, ECC, 2023
- [47] Online Convex Optimization in Adversarial Markov Decision Processes. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
- [49] Online Learning in Markov Decision Processes with Continuous Actions. ALGORITHMIC LEARNING THEORY, ALT 2015, 2015, 9355: 302-316
- [50] Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL. UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216: 2488-2497