50 items in total
- [31] Reinforcement online learning to rank with unbiased reward shaping. INFORMATION RETRIEVAL JOURNAL, 2022, 25 (04): 386-413
- [32] Reinforcement online learning to rank with unbiased reward shaping. Information Retrieval Journal, 2022, 25: 386-413
- [33] Temporal-Logic-Based Reward Shaping for Continuing Reinforcement Learning Tasks. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35: 7995-8003
- [34] Expressing Arbitrary Reward Functions as Potential-Based Advice. PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015: 2652-2658
- [35] Multi-Agent Meta-Reinforcement Learning with Coordination and Reward Shaping for Traffic Signal Control. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2023, PT II, 2023, 13936: 349-360
- [36] Direct reward and indirect reward in multi-agent reinforcement learning. ROBOCUP 2002: ROBOT SOCCER WORLD CUP VI, 2003, 2752: 359-366
- [37] Direct reward and indirect reward in multi-agent reinforcement learning. Ohta, M. (ohta@carc.aist.go.jp), (Springer Verlag):
- [38] Adaptively Shaping Reinforcement Learning Agents via Human Reward. PRICAI 2018: TRENDS IN ARTIFICIAL INTELLIGENCE, PT I, 2018, 11012: 85-97
- [39] Generalized Maximum Entropy Reinforcement Learning via Reward Shaping. IEEE Transactions on Artificial Intelligence, 2024, 5 (04): 1563-1572