48 entries in total
- [21] Learning Equilibria in Matching Markets from Bandit Feedback ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
- [22] Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
- [23] Learning Structured Predictors from Bandit Feedback for Interactive NLP PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016: 1610-1620
- [24] Counterfactual Risk Minimization: Learning from Logged Bandit Feedback INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37, 2015, 37: 814-823
- [26] Online Multiclass Learning with "Bandit" Feedback under a Confidence-Weighted Approach 2016 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2016
- [27] Simulating Bandit Learning from User Feedback for Extractive Question Answering PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022: 5167-5179
- [28] Risk-Averse Trees for Learning from Logged Bandit Feedback 2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017: 976-983
- [29] Targeting Optimization for Internet Advertising by Learning from Logged Bandit Feedback 2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018