50 results in total
- [32] Fast and stable learning of quasi-passive dynamic walking by an unstable biped robot based on off-policy natural actor-critic 2006 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-12, 2006, : 5226 - +
- [33] Off-policy Learning in Two-stage Recommender Systems WEB CONFERENCE 2020: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2020), 2020, : 463 - 473
- [34] Actor-critic algorithms ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 12, 2000, 12 : 1008 - 1014
- [35] On actor-critic algorithms SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2003, 42 (04) : 1143 - 1166
- [38] Optimal Actor-Critic Policy With Optimized Training Datasets IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2022, 6 (06) : 1324 - 1334
- [39] Policy-Gradient Based Actor-Critic Algorithms PROCEEDINGS OF THE 2009 WRI GLOBAL CONGRESS ON INTELLIGENT SYSTEMS, VOL III, 2009, : 505 - 509
- [40] Exploring Policy Diversity in Parallel Actor-Critic Learning 2022 IEEE 34TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2022, : 1196 - 1203