50 items in total
- [31] Online Regret Bounds for Markov Decision Processes with Deterministic Transitions. ALGORITHMIC LEARNING THEORY, PROCEEDINGS, 2008, 5254: 123-137
- [34] Simple Regret Optimization in Online Planning for Markov Decision Processes. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2014, 51: 165-205
- [35] Learning Policies for Markov Decision Processes in Continuous Spaces. 2018 IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2018: 4751-4758
- [36] Active Learning of Markov Decision Processes for System Verification. 2012 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2012), VOL 2, 2012: 289-294
- [37] Active learning in partially observable Markov decision processes. MACHINE LEARNING: ECML 2005, PROCEEDINGS, 2005, 3720: 601-608
- [40] Learning Adversarial Markov Decision Processes with Delayed Feedback. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022: 7281-7289