50 entries in total
- [1] Multigrid Reinforcement Learning with Reward Shaping ARTIFICIAL NEURAL NETWORKS - ICANN 2008, PT I, 2008, 5163 : 357 - 366
- [2] Error bounds in reinforcement learning policy evaluation ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2005, 3501 : 438 - 449
- [3] Least Square Policy Evaluation in Reinforcement Learning INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND INDUSTRIAL AUTOMATION (ICITIA 2015), 2015 : 583 - 590
- [4] Independent Policy Gradient Methods for Competitive Reinforcement Learning ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
- [5] Policy gradient methods for reinforcement learning with function approximation ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 12, 2000, 12 : 1057 - 1063
- [10] A perspective on off-policy evaluation in reinforcement learning Frontiers of Computer Science, 2019, 13 : 911 - 912