Reward Identification in Inverse Reinforcement Learning

Cited: 0
Authors
Kim, Kuno [1 ]
Garg, Shivam [1 ]
Shiragur, Kirankumar [1 ]
Ermon, Stefano [1 ]
Affiliation
[1] Stanford Univ, Dept Comp Sci, Palo Alto, CA 94304 USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, 2021, Vol. 139
Keywords
DYNAMIC DISCRETE-CHOICE; MODELS;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
We study the problem of reward identifiability in the context of Inverse Reinforcement Learning (IRL). The reward identifiability question is critical to answer when reasoning about the effectiveness of using Markov Decision Processes (MDPs) as computational models of real-world decision makers in order to understand complex decision-making behavior and perform counterfactual reasoning. While identifiability has been acknowledged as a fundamental theoretical question in IRL, little is known about the types of MDPs for which rewards are identifiable, or even whether such MDPs exist. In this work, we formalize the reward identification problem in IRL and study how identifiability relates to properties of the MDP model. For deterministic MDP models with the MaxEntRL objective, we prove necessary and sufficient conditions for identifiability. Building on these results, we present efficient algorithms for testing whether an MDP model is identifiable.
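The non-identifiability motivating this question can be seen in a minimal sketch (a hypothetical toy MDP, not taken from the paper): under the MaxEnt objective, a trajectory's probability is proportional to the exponentiated sum of its rewards, so over a fixed horizon any constant shift of the reward yields exactly the same trajectory distribution, and hence the same observed behavior.

```python
import itertools
import math

# Hypothetical toy example: a deterministic 2-state, 2-action MDP with
# fixed horizon T. Under MaxEnt, P(tau) is proportional to exp(return(tau)).
T = 3
actions = [0, 1]

def step(s, a):
    # Deterministic transition: toggle state when a == 1.
    return (s + a) % 2

def trajectory_probs(reward):
    """MaxEnt trajectory distribution over all length-T action sequences."""
    weights = {}
    for seq in itertools.product(actions, repeat=T):
        s, ret = 0, 0.0
        for a in seq:
            ret += reward(s, a)
            s = step(s, a)
        weights[seq] = math.exp(ret)
    Z = sum(weights.values())  # partition function
    return {seq: w / Z for seq, w in weights.items()}

r1 = lambda s, a: float(s == a)   # some base reward
r2 = lambda s, a: r1(s, a) + 5.0  # constant shift of the reward

p1, p2 = trajectory_probs(r1), trajectory_probs(r2)
# The exp(5.0 * T) factor cancels in normalization: identical behavior,
# so r1 and r2 cannot be distinguished from demonstrations alone.
assert all(abs(p1[k] - p2[k]) < 1e-9 for k in p1)
```

This only illustrates one trivial equivalence class (constant shifts); the paper's contribution is characterizing exactly when such ambiguities do or do not collapse to the true reward in deterministic MDPs.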
Pages: 10
Related Papers (50 total)
  • [1] Compatible Reward Inverse Reinforcement Learning
    Metelli, Alberto Maria
    Pirotta, Matteo
    Restelli, Marcello
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [2] Active Learning for Reward Estimation in Inverse Reinforcement Learning
    Lopes, Manuel
    Melo, Francisco
    Montesano, Luis
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT II, 2009, 5782 : 31+
  • [3] Option compatible reward inverse reinforcement learning
    Hwang, Rakhoon
    Lee, Hanjin
    Hwang, Hyung Ju
    PATTERN RECOGNITION LETTERS, 2022, 154 : 83 - 89
  • [4] Inverse Reinforcement Learning with the Average Reward Criterion
    Wu, Feiyang
    Ke, Jingyang
    Wu, Anqi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [5] Inverse Reinforcement Learning with Locally Consistent Reward Functions
    Quoc Phong Nguyen
    Low, Kian Hsiang
    Jaillet, Patrick
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [6] Inverse Reinforcement Learning for Strategy Identification
    Rucker, Mark
    Adams, Stephen
    Hayes, Roy
    Beling, Peter A.
    2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021, : 3067 - 3074
  • [8] Reward Function Using Inverse Reinforcement Learning and Fuzzy Reasoning
    Kato, Yuta
    Kanoh, Masayoshi
    Nakamura, Tsuyoshi
    2020 JOINT 11TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS AND 21ST INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (SCIS-ISIS), 2020, : 222 - 227
  • [10] Modified reward function on abstract features in inverse reinforcement learning
    Chen, Shen-yi
    Qian, Hui
    Fan, Jia
    Jin, Zhuo-jun
    Zhu, Miao-liang
    JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE C-COMPUTERS & ELECTRONICS, 2010, 11 (09): : 718 - 723