50 items in total
- [36] Advantage Based Value Iteration for Markov Decision Processes with Unknown Rewards. 2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016: 3837-3844
- [38] Efficient Off-Policy Algorithms for Structured Markov Decision Processes. 2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023: 8312-8319