Inverse Reinforcement Learning from a Gradient-based Learner

Cited by: 0
Authors
Ramponi, Giorgia [1]
Drappo, Gianluca [1]
Restelli, Marcello [1]
Affiliations
[1] Politecn Milan, Milan, Italy
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Inverse Reinforcement Learning addresses the problem of inferring an expert's reward function from demonstrations. However, in many applications, we not only have access to the expert's near-optimal behaviour, but we also observe part of her learning process. In this paper, we propose a new algorithm for this setting, in which the goal is to recover the reward function being optimized by an agent, given a sequence of policies produced during learning. Our approach is based on the assumption that the observed agent is updating her policy parameters along the gradient direction. Then we extend our method to deal with the more realistic scenario where we only have access to a dataset of learning trajectories. For both settings, we provide theoretical insights into our algorithms' performance. Finally, we evaluate the approach in a simulated GridWorld environment and on the MuJoCo environments, comparing it with the state-of-the-art baseline.
Pages: 11
Related papers
50 in total
  • [31] Robust gradient-based iterative learning control
    Owens, D. H.
    Haetoenen, J.
    Daley, S.
    2007 INTERNATIONAL WORKSHOP ON MULTIDIMENSIONAL SYSTEMS, 2007, : 143 - 148
  • [32] Gradient-Based Local Causal Structure Learning
    Liang, Jiaxuan
    Wang, Jun
    Yu, Guoxian
    Domeniconi, Carlotta
    Zhang, Xiangliang
    Guo, Maozu
    IEEE TRANSACTIONS ON CYBERNETICS, 2024, 54 (01) : 486 - 495
  • [33] Gradient-based learning applied to document recognition
    AT&T Lab-Research, Red Bank, United States
    Proc IEEE, 11 (2278-2323):
  • [34] Masked Gradient-Based Causal Structure Learning
    Ng, Ignavier
    Zhu, Shengyu
    Fang, Zhuangyan
    Li, Haoyang
    Chen, Zhitang
    Wang, Jun
    PROCEEDINGS OF THE 2022 SIAM INTERNATIONAL CONFERENCE ON DATA MINING, SDM, 2022, : 424 - 432
  • [35] Gradient-based learning applied to document recognition
    Lecun, Y
    Bottou, L
    Bengio, Y
    Haffner, P
    PROCEEDINGS OF THE IEEE, 1998, 86 (11) : 2278 - 2324
  • [36] Robust gradient-based iterative learning control
    Owens, D. H.
    Hatonen, J.
    Daley, S.
    2007 IEEE INTERNATIONAL CONFERENCE ON NETWORKING, SENSING, AND CONTROL, VOLS 1 AND 2, 2007, : 163 - 168
  • [37] Gradient-Based Inverse Estimation for a Rainfall-Runoff Model
    Krapu, Christopher
    Borsuk, Mark
    Kumar, Mukesh
    WATER RESOURCES RESEARCH, 2019, 55 (08) : 6625 - 6639
  • [38] Metacognitive Radar: Masking Cognition From an Inverse Reinforcement Learner
    Pattanayak, Kunal
    Krishnamurthy, Vikram
    Berry, Christopher M.
    IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS, 2023, 59 (06) : 8826 - 8844
  • [39] Reinforcement learning guided Spearman dynamic opposite Gradient-based optimizer for numerical optimization and anchor clustering
    Sun, Kangjian
    Huo, Ju
    Jia, Heming
    Yue, Lin
    JOURNAL OF COMPUTATIONAL DESIGN AND ENGINEERING, 2024, 11 (01) : 12 - 33
  • [40] Policy Gradient-based Integral Reinforcement Learning for Optimal Control Design of Nonaffine Morphing Aircraft Systems
    Lee, Hanna
    Kim, Seong-Hun
    Kim, Youdan
    2020 28TH MEDITERRANEAN CONFERENCE ON CONTROL AND AUTOMATION (MED), 2020, : 218 - 223