Reward Identification in Inverse Reinforcement Learning

Cited: 0
Authors
Kim, Kuno [1 ]
Garg, Shivam [1 ]
Shiragur, Kirankumar [1 ]
Ermon, Stefano [1 ]
Affiliation
[1] Stanford Univ, Dept Comp Sci, Palo Alto, CA 94304 USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, 2021, Vol. 139
Keywords
DYNAMIC DISCRETE-CHOICE; MODELS;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
We study the problem of reward identifiability in the context of Inverse Reinforcement Learning (IRL). The reward identifiability question is critical to answer when reasoning about the effectiveness of using Markov Decision Processes (MDPs) as computational models of real-world decision makers in order to understand complex decision-making behavior and perform counterfactual reasoning. While identifiability has been acknowledged as a fundamental theoretical question in IRL, little is known about the types of MDPs for which rewards are identifiable, or even whether such MDPs exist. In this work, we formalize the reward identification problem in IRL and study how identifiability relates to properties of the MDP model. For deterministic MDP models with the MaxEntRL objective, we prove necessary and sufficient conditions for identifiability. Building on these results, we present efficient algorithms for testing whether an MDP model is identifiable.
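The non-identifiability motivating this question can be seen in a minimal sketch (a hypothetical toy MDP, not taken from the paper): under the MaxEnt objective, a trajectory's probability is proportional to the exponentiated sum of its rewards, so over a fixed horizon any constant shift of the reward yields exactly the same trajectory distribution, and hence the same observed behavior.

```python
import itertools
import math

# Hypothetical toy example: a deterministic 2-state, 2-action MDP with
# fixed horizon T. Under MaxEnt, P(tau) is proportional to exp(return(tau)).
T = 3
actions = [0, 1]

def step(s, a):
    # Deterministic transition: toggle state when a == 1.
    return (s + a) % 2

def trajectory_probs(reward):
    """MaxEnt trajectory distribution over all length-T action sequences."""
    weights = {}
    for seq in itertools.product(actions, repeat=T):
        s, ret = 0, 0.0
        for a in seq:
            ret += reward(s, a)
            s = step(s, a)
        weights[seq] = math.exp(ret)
    Z = sum(weights.values())  # partition function
    return {seq: w / Z for seq, w in weights.items()}

r1 = lambda s, a: float(s == a)   # some base reward
r2 = lambda s, a: r1(s, a) + 5.0  # constant shift of the reward

p1, p2 = trajectory_probs(r1), trajectory_probs(r2)
# The exp(5.0 * T) factor cancels in normalization: identical behavior,
# so r1 and r2 cannot be distinguished from demonstrations alone.
assert all(abs(p1[k] - p2[k]) < 1e-9 for k in p1)
```

This only illustrates one trivial equivalence class (constant shifts); the paper's contribution is characterizing exactly when such ambiguities do or do not collapse to the true reward in deterministic MDPs.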
Pages: 10
Related Papers (50 total)
  • [1] Compatible Reward Inverse Reinforcement Learning
    Metelli, Alberto Maria
    Pirotta, Matteo
    Restelli, Marcello
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [2] Active Learning for Reward Estimation in Inverse Reinforcement Learning
    Lopes, Manuel
    Melo, Francisco
    Montesano, Luis
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT II, 2009, 5782 : 31+
  • [3] Option compatible reward inverse reinforcement learning
    Hwang, Rakhoon
    Lee, Hanjin
    Hwang, Hyung Ju
    PATTERN RECOGNITION LETTERS, 2022, 154 : 83 - 89
  • [4] Inverse Reinforcement Learning with the Average Reward Criterion
    Wu, Feiyang
    Ke, Jingyang
    Wu, Anqi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [5] Inverse Reinforcement Learning with Locally Consistent Reward Functions
    Quoc Phong Nguyen
    Low, Kian Hsiang
    Jaillet, Patrick
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [6] Inverse Reinforcement Learning for Strategy Identification
    Rucker, Mark
    Adams, Stephen
    Hayes, Roy
    Beling, Peter A.
    2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021, : 3067 - 3074
  • [8] Reward Function Using Inverse Reinforcement Learning and Fuzzy Reasoning
    Kato, Yuta
    Kanoh, Masayoshi
    Nakamura, Tsuyoshi
    2020 JOINT 11TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS AND 21ST INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (SCIS-ISIS), 2020, : 222 - 227
  • [10] Modified reward function on abstract features in inverse reinforcement learning
    Chen, Shen-yi
    Qian, Hui
    Fan, Jia
    Jin, Zhuo-jun
    Zhu, Miao-liang
    JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE C-COMPUTERS & ELECTRONICS, 2010, 11 (09): : 718 - 723