50 items in total
- [4] Data-driven policy iteration algorithm for optimal control of continuous-time Ito stochastic systems with Markovian jumps. IET Control Theory and Applications, 2016, 10(12): 1431-1439
- [6] Q-learning and policy iteration algorithms for stochastic shortest path problems. Annals of Operations Research, 2013, 208: 95-132