50 entries in total
- [1] Reward estimation for dialogue policy optimisation. COMPUTER SPEECH AND LANGUAGE, 2018, 51 : 24 - 43
- [2] WeaSuLπ: Weakly Supervised Dialogue Policy Learning: Reward Estimation for Multi-turn Dialogue. 1ST WORKSHOP ON DOCUMENT-GROUNDED DIALOGUE AND CONVERSATIONAL QUESTION ANSWERING (DIALDOC 2021), 2021: 69 - 80
- [3] Semi-Supervised Dialogue Policy Learning via Stochastic Reward Estimation. 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020: 660 - 670
- [4] Domain-independent User Satisfaction Reward Estimation for Dialogue Policy Learning. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017: 1721 - 1725
- [5] Combining Curriculum Learning and Knowledge Distillation for Dialogue Generation. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021: 1284 - 1295
- [6] On the Applicability of a User Satisfaction-Based Reward for Dialogue Policy Learning. ADVANCED SOCIAL INTERACTION WITH AGENTS, 2019, 510 : 211 - 217
- [7] On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems. PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016: 2431 - 2441
- [8] Reward Function Learning for Dialogue Management. PROCEEDINGS OF THE SIXTH STARTING AI RESEARCHERS' SYMPOSIUM (STAIRS 2012), 2012, 241 : 95 - +
- [9] Hierarchical Knowledge Distillation for Dialogue Sequence Labeling. 2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021: 433 - 440
- [10] DROID: Learning from Offline Heterogeneous Demonstrations via Reward-Policy Distillation. CONFERENCE ON ROBOT LEARNING, VOL 229, 2023, 229