50 entries in total
- [2] New value iteration and Q-learning methods for the average cost dynamic programming problem. PROCEEDINGS OF THE 37TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-4, 1998: 2692-2697
- [3] New value iteration and Q-learning methods for the average cost dynamic programming problem. Proc IEEE Conf Decis Control: 2692-2697
- [5] Q-learning and policy iteration algorithms for stochastic shortest path problems. Annals of Operations Research, 2013, 208: 95-132
- [7] A dynamic channel assignment policy through Q-learning. IEEE TRANSACTIONS ON NEURAL NETWORKS, 1999, 10(6): 1443-1455
- [8] Discounted UCB1-tuned for Q-Learning. 2014 JOINT 7TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (SCIS) AND 15TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (ISIS), 2014: 966-970
- [10] Dynamic programming with NAR model versus Q-learning - Case study. NEURAL NETWORKS AND SOFT COMPUTING, 2003: 728-733