50 entries in total
- [1] A LIMIT THEOREM FOR MARKOV DECISION PROCESSES. JOURNAL OF DYNAMICS AND GAMES, 2014, 1(4): 639-659
- [2] Online Convex Optimization in Adversarial Markov Decision Processes. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
- [3] Learning Adversarial Markov Decision Processes with Delayed Feedback. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022: 7281-7289
- [5] An envelope theorem and some applications to discounted Markov decision processes. Mathematical Methods of Operations Research, 2008, 67: 299-321
- [6] Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
- [7] Robust Lagrangian and Adversarial Policy Gradient for Robust Constrained Markov Decision Processes. 2024 IEEE CONFERENCE ON ARTIFICIAL INTELLIGENCE, CAI 2024, 2024: 1227-1239
- [8] Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
- [9] A LaSalle version of Matrosov theorem. 2011 50TH IEEE CONFERENCE ON DECISION AND CONTROL AND EUROPEAN CONTROL CONFERENCE (CDC-ECC), 2011: 320-324