Learning state importance for preference-based reinforcement learning

Cited by: 5

Authors:

Zhang, Guoxi [1]

Kashima, Hisashi [1,2]

Affiliations:

[1] Kyoto Univ, Grad Sch Informat, Yoshida Honmachi, Kyoto 6068501, Japan

[2] RIKEN Guardian Robot Project, Kyoto, Japan

Source:

MACHINE LEARNING | 2023 / Vol. 113 / Issue 4

Keywords:

Interpretable reinforcement learning; Preference-based reinforcement learning; Human-in-the-loop reinforcement learning; Interpretability artificial intelligence;

DOI:

10.1007/s10994-022-06295-5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Preference-based reinforcement learning (PbRL) develops agents using human preferences. Owing to its empirical success, it has the prospect of benefiting human-centered applications. However, previous work on PbRL overlooks interpretability, which is an indispensable element of ethical artificial intelligence (AI). While prior art in explainable AI offers some machinery, it lacks an approach for selecting the samples used to construct explanations. This becomes an issue for PbRL, as transitions relevant to task solving are often outnumbered by irrelevant ones. Thus, ad hoc sample selection undermines the credibility of explanations. The present study proposes a framework for simultaneously learning reward functions and state importance from preferences. It offers a systematic approach for selecting samples when constructing explanations. Moreover, the present study proposes a perturbation analysis to evaluate the learned state importance quantitatively. Through experiments on discrete and continuous control tasks, the present study demonstrates the proposed framework's efficacy in providing interpretability without sacrificing task performance.


Pages: 1885 - 1901

Page count: 17
