50 entries in total
- [32] Fast Probabilistic Policy Reuse via Reward Function Fitting 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
- [33] Boosting Policy Learning in Reinforcement Learning via Adaptive Intrinsic Reward Regulation IEEE ACCESS, 2024, 12 : 2224 - 2235
- [35] LEARNING AND TEACHING THROUGH DISCUSSION CENTRAL STATES SPEECH JOURNAL, 1962, 13 (03): 198 - 198
- [38] BATCH POLICY LEARNING IN AVERAGE REWARD MARKOV DECISION PROCESSES ANNALS OF STATISTICS, 2022, 50 (06): 3364 - 3387
- [39] Reward-Free Policy Space Compression for Reinforcement Learning INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
- [40] Pessimistic Reward Models for Off-Policy Learning in Recommendation 15TH ACM CONFERENCE ON RECOMMENDER SYSTEMS (RECSYS 2021), 2021: 63 - 74