Preference-based online learning with dueling bandits: A survey

Cited by: 0
Authors
Bengs, Viktor [1]
Busa-Fekete, Robert [2]
Mesaoudi-Paul, Adil El [1]
Hüllermeier, Eyke [1]
Affiliations
[1] Heinz Nixdorf Institute, Department of Computer Science, Paderborn University, Germany
[2] Google Research, New York, NY, United States
DOI: not available
Related papers
50 in total
  • [21] Active Preference-Based Learning of Reward Functions
    Sadigh, Dorsa
    Dragan, Anca D.
    Sastry, Shankar
    Seshia, Sanjit A.
    ROBOTICS: SCIENCE AND SYSTEMS XIII, 2017
  • [22] Learning solution similarity in preference-based CBR
    Abdel-Aziz, Amira
    Strickert, Marc
    Hüllermeier, Eyke
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2014, 8765: 17-31
  • [23] Versatile Dueling Bandits: Best-of-both World Analyses for Online Learning from Relative Preferences
    Saha, Aadirupa
    Gaillard, Pierre
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022: 19011-19026
  • [24] Inverse Preference Learning: Preference-based RL without a Reward Function
    Hejna, Joey
    Sadigh, Dorsa
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [25] Online Certification of Preference-Based Fairness for Personalized Recommender Systems
    Do, Virginie
    Corbett-Davies, Sam
    Atif, Jamal
    Usunier, Nicolas
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022: 6532-6540
  • [26] Online Rank Elicitation for Plackett-Luce: A Dueling Bandits Approach
    Szorenyi, Balazs
    Busa-Fekete, Robert
    Paul, Adil
    Hüllermeier, Eyke
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [27] A Generalized Acquisition Function for Preference-based Reward Learning
    Ellis, Evan
    Ghosal, Gaurav R.
    Russell, Stuart J.
    Dragan, Anca
    Biyik, Erdem
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, 2024: 2814-2821
  • [28] Model-Free Preference-Based Reinforcement Learning
    Wirth, Christian
    Fürnkranz, Johannes
    Neumann, Gerhard
    THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016: 2222-2228
  • [29] Embedding Learning for Preference-based Speech Quality Assessment
    Hu, Cheng-Hung
    Yasuda, Yusuke
    Toda, Tomoki
    INTERSPEECH 2024, 2024: 2685-2689
  • [30] Learning to Identify Top Elo Ratings: A Dueling Bandits Approach
    Yan, Xue
    Du, Yali
    Ru, Binxin
    Wang, Jun
    Zhang, Haifeng
    Chen, Xu
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022: 8797-8805