Learning state importance for preference-based reinforcement learning

Cited by: 5
Authors
Zhang, Guoxi [1 ]
Kashima, Hisashi [1 ,2 ]
Affiliations
[1] Kyoto Univ, Grad Sch Informat, Yoshida Honmachi, Kyoto 6068501, Japan
[2] RIKEN Guardian Robot Project, Kyoto, Japan
Keywords
Interpretable reinforcement learning; Preference-based reinforcement learning; Human-in-the-loop reinforcement learning; Interpretable artificial intelligence
DOI
10.1007/s10994-022-06295-5
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Preference-based reinforcement learning (PbRL) develops agents using human preferences. Owing to its empirical success, it holds promise for human-centered applications. However, previous work on PbRL has overlooked interpretability, an indispensable element of ethical artificial intelligence (AI). While prior work on explainable AI offers some machinery, it lacks an approach for selecting the samples from which explanations are constructed. This is a problem for PbRL, because transitions relevant to task solving are often outnumbered by irrelevant ones, so ad-hoc sample selection undermines the credibility of explanations. The present study proposes a framework that learns reward functions and state importance from preferences simultaneously, offering a systematic way to select samples when constructing explanations. It also proposes a perturbation analysis to evaluate the learned state importance quantitatively. Through experiments on discrete and continuous control tasks, the study demonstrates that the proposed framework provides interpretability without sacrificing task performance.
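To make the abstract's idea concrete, the following is a minimal illustrative sketch, not the authors' released code, of one plausible reading: a Bradley-Terry preference model over trajectory segments whose return is an importance-weighted sum of per-state rewards, so that reward and state importance are fit jointly from pairwise preferences. All module names, network sizes, and the toy data below are assumptions made for illustration.

```python
# Hypothetical sketch: jointly fit a reward model and per-state importance
# weights from pairwise segment preferences (Bradley-Terry likelihood).
import torch
import torch.nn as nn


class RewardWithImportance(nn.Module):
    def __init__(self, state_dim, hidden=64):
        super().__init__()
        # Per-state reward head.
        self.reward = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )
        # Per-state importance head; softmax over a segment gives weights.
        self.importance = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def segment_return(self, seg):
        # seg: (T, state_dim) tensor of states in one trajectory segment.
        r = self.reward(seg).squeeze(-1)                             # (T,)
        w = torch.softmax(self.importance(seg).squeeze(-1), dim=0)   # (T,)
        return (w * r).sum(), w   # importance-weighted return, weights


def preference_loss(model, seg_a, seg_b, pref):
    # pref = 1.0 if segment A is preferred, 0.0 if segment B is preferred.
    ret_a, _ = model.segment_return(seg_a)
    ret_b, _ = model.segment_return(seg_b)
    p_a = torch.sigmoid(ret_a - ret_b)  # Bradley-Terry preference probability
    return -(pref * torch.log(p_a + 1e-8)
             + (1 - pref) * torch.log(1 - p_a + 1e-8))


if __name__ == "__main__":
    torch.manual_seed(0)
    model = RewardWithImportance(state_dim=4)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    # Toy preference data: random 10-step segment pairs with random labels.
    data = [(torch.randn(10, 4), torch.randn(10, 4), torch.tensor(float(i % 2)))
            for i in range(32)]
    for epoch in range(5):
        total = 0.0
        for seg_a, seg_b, pref in data:
            loss = preference_loss(model, seg_a, seg_b, pref)
            opt.zero_grad()
            loss.backward()
            opt.step()
            total += loss.item()
        print(f"epoch {epoch}: avg loss {total / len(data):.3f}")
```

Under this reading, the softmax weights indicate which states drive a segment's predicted return, so the highest-weight states would be the ones selected when building explanations, and a perturbation analysis such as the one the paper describes could then measure how much preference predictions change when those states are perturbed.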
Pages: 1885-1901
Page count: 17