Inverse Reinforcement Learning from a Gradient-based Learner

Cited by: 0
Authors
Ramponi, Giorgia [1]
Drappo, Gianluca [1]
Restelli, Marcello [1]
Affiliations
[1] Politecn Milan, Milan, Italy
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Inverse Reinforcement Learning addresses the problem of inferring an expert's reward function from demonstrations. However, in many applications, we not only have access to the expert's near-optimal behaviour, but we also observe part of her learning process. In this paper, we propose a new algorithm for this setting, in which the goal is to recover the reward function being optimized by an agent, given a sequence of policies produced during learning. Our approach is based on the assumption that the observed agent is updating her policy parameters along the gradient direction. Then we extend our method to deal with the more realistic scenario where we only have access to a dataset of learning trajectories. For both settings, we provide theoretical insights into our algorithms' performance. Finally, we evaluate the approach in a simulated GridWorld environment and on the MuJoCo environments, comparing it with the state-of-the-art baseline.
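The gradient-direction assumption in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it additionally assumes a reward that is linear in some features, a constant (unknown) learning rate, and that the Jacobian of the expected features with respect to the policy parameters is already known at each step. Under those assumptions each observed update is a linear function of the reward weights, so the weights can be estimated up to scale by least squares. The function name `recover_reward_weights`, the array shapes, and the synthetic data are all hypothetical.

```python
import numpy as np


def recover_reward_weights(thetas, jacobians):
    """Estimate reward weights (up to scale) from a sequence of policy parameters.

    Assumed model (an illustration, not taken from the paper):
        thetas[t + 1] - thetas[t] = alpha * jacobians[t] @ w

    thetas:    array of shape (T + 1, d), observed policy parameters
    jacobians: array of shape (T, d, q), assumed-known Jacobians of the
               expected features with respect to the policy parameters
    Returns a unit-norm weight vector of shape (q,).
    """
    deltas = np.diff(thetas, axis=0)                 # (T, d) parameter updates
    A = jacobians.reshape(-1, jacobians.shape[-1])   # stack all steps: (T * d, q)
    b = deltas.reshape(-1)                           # matching targets: (T * d,)
    scaled_w, *_ = np.linalg.lstsq(A, b, rcond=None)
    # The learning rate is unknown, so only the direction of w is identifiable.
    return scaled_w / np.linalg.norm(scaled_w)


# Synthetic check: random matrices stand in for the Jacobians (hypothetical data).
rng = np.random.default_rng(0)
T, d, q = 20, 4, 3
true_w = np.array([0.6, 0.0, 0.8])   # unit-norm ground-truth weights
alpha = 0.1                          # learner's step size, hidden from the estimator
jacobians = rng.normal(size=(T, d, q))

thetas = np.zeros((T + 1, d))
for t in range(T):
    # Noise-free gradient-ascent step of the simulated learner.
    thetas[t + 1] = thetas[t] + alpha * jacobians[t] @ true_w

w_hat = recover_reward_weights(thetas, jacobians)
print(np.round(w_hat, 3))
```

Normalizing the estimate reflects that a reward function and any positive multiple of it induce the same update directions. When only trajectories are available, the Jacobians and the policy parameters would themselves have to be estimated, which this sketch does not attempt.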
Pages: 11