50 entries in total
- [31] Batch reinforcement learning with state importance. MACHINE LEARNING: ECML 2004, PROCEEDINGS, 2004, 3201: 566-568
- [32] Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
- [34] A Generalized Acquisition Function for Preference-based Reward Learning. 2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, 2024: 2814-2821
- [36] Inverse Preference Learning: Preference-based RL without a Reward Function. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
- [37] A Survey of Preference-Based Online Learning with Bandit Algorithms. ALGORITHMIC LEARNING THEORY (ALT 2014), 2014, 8776: 18-39
- [38] Embedding Learning for Preference-based Speech Quality Assessment. INTERSPEECH 2024, 2024: 2685-2689
- [40] APReL: A Library for Active Preference-based Reward Learning Algorithms. PROCEEDINGS OF THE 2022 17TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION (HRI '22), 2022: 613-617