A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation

Citations: 0
Authors
Fujimoto, Scott [1 ]
Meger, David [1 ]
Precup, Doina [1 ]
Affiliations
[1] McGill Univ, Mila, Montreal, PQ, Canada
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139 | 2021 / Vol. 139
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
ENVIRONMENT;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Marginalized importance sampling (MIS), which measures the density ratio between the state-action occupancy of a target policy and that of a sampling distribution, is a promising approach for off-policy evaluation. However, current state-of-the-art MIS methods rely on complex optimization tricks and succeed mostly on simple toy problems. We bridge the gap between MIS and deep reinforcement learning by observing that the density ratio can be computed from the successor representation of the target policy. The successor representation can be trained through deep reinforcement learning methodology and decouples the reward optimization from the dynamics of the environment, making the resulting algorithm stable and applicable to high-dimensional domains. We evaluate the empirical performance of our approach on a variety of challenging Atari and MuJoCo environments.
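The abstract's central observation, that the occupancy density ratio can be read off from the successor representation (SR), can be illustrated in the tabular case. The sketch below is illustrative only: the random MDP, variable names, and closed-form matrix inverse are assumptions for exposition, not the paper's deep RL algorithm, which learns the SR with temporal-difference methods instead.

```python
import numpy as np

# Tabular sketch (illustrative, not the paper's algorithm): compute the
# discounted state-action occupancy of a target policy from its successor
# representation, then form the MIS density ratio against a sampling
# distribution. All quantities here are assumed/randomly generated.

n_states, n_actions, gamma = 4, 2, 0.9
rng = np.random.default_rng(0)

# Random MDP dynamics P[s, a, s'] and a fixed target policy pi[s, a].
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
pi = rng.dirichlet(np.ones(n_actions), size=n_states)

# Transition matrix over state-action pairs under pi:
# P_pi[(s,a), (s',a')] = P[s, a, s'] * pi[s', a'].
sa = n_states * n_actions
P_pi = np.zeros((sa, sa))
for s in range(n_states):
    for a in range(n_actions):
        for s2 in range(n_states):
            for a2 in range(n_actions):
                P_pi[s * n_actions + a, s2 * n_actions + a2] = P[s, a, s2] * pi[s2, a2]

# Successor representation: Psi = (I - gamma * P_pi)^{-1}, i.e. the
# expected discounted visitation counts of every state-action pair.
Psi = np.linalg.inv(np.eye(sa) - gamma * P_pi)

# Discounted occupancy of pi from a uniform start distribution d0:
# d_pi = (1 - gamma) * d0_sa^T @ Psi. It sums to 1 by construction.
d0 = np.full(n_states, 1.0 / n_states)
d0_sa = (d0[:, None] * pi).reshape(sa)
d_pi = (1 - gamma) * d0_sa @ Psi

# MIS density ratio against an arbitrary sampling distribution d_D.
d_D = rng.dirichlet(np.ones(sa))
ratio = d_pi / d_D
```

In the deep RL setting the paper targets, `Psi` is not inverted in closed form but learned with standard value-based machinery, which is what makes the approach scale to Atari and MuJoCo.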
Pages: 12
Related Papers (50 total; entries [21]-[30] shown)
  • [21] Associative Learning of an Unnormalized Successor Representation
    Verosky, Niels J.
    NEURAL COMPUTATION, 2024, 36 (07) : 1410 - 1423
  • [22] A reinforcement learning approach to rare trajectory sampling
    Rose, Dominic C.
    Mair, Jamie F.
    Garrahan, Juan P.
    NEW JOURNAL OF PHYSICS, 2021, 23 (01):
  • [23] A Novel Adaptive Sampling Strategy for Deep Reinforcement Learning
    Liang, Xingxing
    Chen, Li
    Feng, Yanghe
    Liu, Zhong
    Ma, Yang
    Huang, Kuihua
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2021, 20 (02)
  • [24] Bayesian Reinforcement Learning via Deep, Sparse Sampling
    Grover, Divya
    Basu, Debabrota
    Dimitrakakis, Christos
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 3036 - 3044
  • [25] Deep Reinforcement Learning With Graph Representation for Vehicle Repositioning
    Yu, Zishun
    Hu, Mengqi
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (08) : 13094 - 13107
  • [26] A State Representation Dueling Network for Deep Reinforcement Learning
    Qiu, Haomin
    Liu, Feng
    2020 IEEE 32ND INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2020, : 669 - 674
  • [27] Competitive-cooperative-concurrent reinforcement learning with importance sampling
    Uchibe, E
    Doya, K
    FROM ANIMALS TO ANIMATS 8, 2004, : 287 - 296
  • [28] Multi-Agent Reinforcement Learning via Adaptive Kalman Temporal Difference and Successor Representation
    Salimibeni, Mohammad
    Mohammadi, Arash
    Malekzadeh, Parvin
    Plataniotis, Konstantinos N.
    SENSORS, 2022, 22 (04)
  • [29] Efficient Deep Reinforcement Learning via Policy-Extended Successor Feature Approximator
    Li, Yining
    Yang, Tianpei
    Hao, Jianye
    Zheng, Yan
    Tang, Hongyao
    DISTRIBUTED ARTIFICIAL INTELLIGENCE, DAI 2022, 2023, 13824 : 29 - 44
  • [30] Traffic Signal Control with Successor Feature-Based Deep Reinforcement Learning Agent
    Szoke, Laszlo
    Aradi, Szilard
    Becsi, Tamas
    ELECTRONICS, 2023, 12 (06)