Preference-based online learning with dueling bandits: A survey

Cited by: 0

Authors:

Bengs, Viktor [1]

Busa-Fekete, Robert [2]

El Mesaoudi-Paul, Adil [1]

Hüllermeier, Eyke [1]

Affiliations:

[1] Heinz Nixdorf Institute, Department of Computer Science, Paderborn University, Germany

[2] Google Research, New York, NY, United States

Source:

Journal of Machine Learning Research | 2021 / Vol. 22

Citations

50 entries in total

[1] Preference-based Online Learning with Dueling Bandits: A Survey
Bengs, Viktor
Busa-Fekete, Robert
El Mesaoudi-Paul, Adil
Huellermeier, Eyke
JOURNAL OF MACHINE LEARNING RESEARCH, 2021, 22
[2] A Survey of Preference-Based Online Learning with Bandit Algorithms
Busa-Fekete, Robert
Huellermeier, Eyke
ALGORITHMIC LEARNING THEORY (ALT 2014), 2014, 8776 : 18 - 39
[3] Dueling Posterior Sampling for Preference-Based Reinforcement Learning
Novoseller, Ellen R.
Wei, Yibing
Sui, Yanan
Yue, Yisong
Burdick, Joel W.
CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI 2020), 2020, 124 : 1029 - 1038
[4] Contextual Bandits and Imitation Learning with Preference-Based Active Queries
Sekhari, Ayush
Sridharan, Karthik
Sun, Wen
Wu, Runzhe
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
[5] A survey of preference-based reinforcement learning methods
1600, Microtome Publishing (18):
[6] A Survey of Preference-Based Reinforcement Learning Methods
Wirth, Christian
Akrour, Riad
Neumann, Gerhard
Fuernkranz, Johannes
JOURNAL OF MACHINE LEARNING RESEARCH, 2017, 18
[7] Non-stationary Dueling Bandits for Online Learning to Rank
Lu, Shiyin
Miao, Yuan
Yang, Ping
Hu, Yao
Zhang, Lijun
WEB AND BIG DATA, PT II, APWEB-WAIM 2022, 2023, 13422 : 166 - 174
[8] Preference-based learning to rank
Ailon, Nir
Mohri, Mehryar
MACHINE LEARNING, 2010, 80 : 189 - 211
[9] Preference-Based Policy Learning
Akrour, Riad
Schoenauer, Marc
Sebag, Michele
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT I, 2011, 6911 : 12 - 27
[10] Preference-based learning to rank
Ailon, Nir
Mohri, Mehryar
MACHINE LEARNING, 2010, 80 (2-3) : 189 - 211
