50 results in total
- [13] Continuous-time Markov Decision Process with Average Reward: Using Reinforcement Learning Method 2015 34TH CHINESE CONTROL CONFERENCE (CCC), 2015: 3097 - 3100
- [14] Learning to maximize reward rate: a model based on semi-Markov decision processes FRONTIERS IN NEUROSCIENCE, 2014, 8
- [15] Risk-Sensitivity and Average Optimality in Markov and Semi-Markov Reward Processes 38TH INTERNATIONAL CONFERENCE ON MATHEMATICAL METHODS IN ECONOMICS (MME 2020), 2020: 537 - 543
- [16] Constrained semi-Markov decision processes with average rewards ZOR. Zeitschrift Fuer Operations Research, 1994, 40 (03):
- [17] Semi-Markov Offline Reinforcement Learning for Healthcare CONFERENCE ON HEALTH, INFERENCE, AND LEARNING, VOL 174, 2022, 174 : 119 - 137
- [18] Adaptive Honeypot Engagement Through Reinforcement Learning of Semi-Markov Decision Processes DECISION AND GAME THEORY FOR SECURITY, 2019, 11836 : 196 - 216
- [20] MAXIMAL AVERAGE-REWARD POLICIES FOR SEMI-MARKOV DECISION PROCESSES WITH ARBITRARY STATE AND ACTION SPACE ANNALS OF MATHEMATICAL STATISTICS, 1971, 42 (05): 1717 - &