Preference-based online learning with dueling bandits: A survey

Cited by: 0
Authors
Bengs, Viktor [1]
Busa-Fekete, Robert [2]
Mesaoudi-Paul, Adil El [1]
Hullermeier, Eyke [1]
Affiliations
[1] Heinz Nixdorf Institute, Department of Computer Science, Paderborn University, Germany
[2] Google Research, New York, NY, United States
DOI: not available
Related papers
50 in total
  • [31] Preference-based Teaching
    Gao, Ziyuan
    Ries, Christoph
    Simon, Hans U.
    Zilles, Sandra
    JOURNAL OF MACHINE LEARNING RESEARCH, 2017, 18 : 1 - 32
  • [32] Preference-based unawareness
    Schipper, Burkhard C.
    MATHEMATICAL SOCIAL SCIENCES, 2014, 70 : 34 - 41
  • [33] APReL: A Library for Active Preference-based Reward Learning Algorithms
    Biyik, Erdem
    Talati, Aditi
    Sadigh, Dorsa
    PROCEEDINGS OF THE 2022 17TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION (HRI '22), 2022 : 613 - 617
  • [34] Preference-Based Assistance Map Learning With Robust Adaptive Oscillators
    Li, Shilei
    Zou, Wulin
    Duan, Pu
    Shi, Ling
    IEEE TRANSACTIONS ON MEDICAL ROBOTICS AND BIONICS, 2022, 4 (04) : 1000 - 1009
  • [35] Preference-based decision making for personalised access to Learning Resources
    Department of Special Education, University of Thessaly, Argonafton and Filellinon Street, Volos, GR 38221, Greece
    Unknown
    Unknown
    Int. J. Auton. Adapt. Commun. Syst., 2008, 3 (356-369):
  • [36] Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation
    Ren, Zhizhou
    Liu, Anji
    Liang, Yitao
    Peng, Jian
    Ma, Jianzhu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [37] A Policy Iteration Algorithm for Learning from Preference-Based Feedback
    Wirth, Christian
    Furnkranz, Johannes
    ADVANCES IN INTELLIGENT DATA ANALYSIS XII, 2013, 8207 : 427 - 437
  • [38] Active Preference-Based Gaussian Process Regression for Reward Learning
    Biyik, Erdem
    Huynh, Nicolas
    Kochenderfer, Mykel J.
    Sadigh, Dorsa
    ROBOTICS: SCIENCE AND SYSTEMS XVI, 2020,
  • [39] Preference-based Reinforcement Learning with Finite-Time Guarantees
    Xu, Yichong
    Wang, Ruosong
    Yang, Lin F.
    Singh, Aarti
    Dubrawski, Artur
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [40] Preference-based valuation of treatment attributes in haemophilia A using web survey
    Carlsson, K. Steen
    Andersson, E.
    Berntorp, E.
    HAEMOPHILIA, 2017, 23 (06) : 894 - 903