A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation

Citations: 0
Authors
Fujimoto, Scott [1 ]
Meger, David [1 ]
Precup, Doina [1 ]
Affiliations
[1] McGill Univ, Mila, Montreal, PQ, Canada
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139 | 2021 / Vol. 139
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
ENVIRONMENT;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Marginalized importance sampling (MIS), which measures the density ratio between the state-action occupancy of a target policy and that of a sampling distribution, is a promising approach for off-policy evaluation. However, current state-of-the-art MIS methods rely on complex optimization tricks and succeed mostly on simple toy problems. We bridge the gap between MIS and deep reinforcement learning by observing that the density ratio can be computed from the successor representation of the target policy. The successor representation can be trained through deep reinforcement learning methodology and decouples the reward optimization from the dynamics of the environment, making the resulting algorithm stable and applicable to high-dimensional domains. We evaluate the empirical performance of our approach on a variety of challenging Atari and MuJoCo environments.
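The abstract's central observation, that the occupancy density ratio can be read off from the successor representation (SR), can be illustrated in the tabular case. The sketch below is illustrative only: the random MDP, variable names, and closed-form matrix inverse are assumptions for exposition, not the paper's deep RL algorithm, which learns the SR with temporal-difference methods instead.

```python
import numpy as np

# Tabular sketch (illustrative, not the paper's algorithm): compute the
# discounted state-action occupancy of a target policy from its successor
# representation, then form the MIS density ratio against a sampling
# distribution. All quantities here are assumed/randomly generated.

n_states, n_actions, gamma = 4, 2, 0.9
rng = np.random.default_rng(0)

# Random MDP dynamics P[s, a, s'] and a fixed target policy pi[s, a].
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
pi = rng.dirichlet(np.ones(n_actions), size=n_states)

# Transition matrix over state-action pairs under pi:
# P_pi[(s,a), (s',a')] = P[s, a, s'] * pi[s', a'].
sa = n_states * n_actions
P_pi = np.zeros((sa, sa))
for s in range(n_states):
    for a in range(n_actions):
        for s2 in range(n_states):
            for a2 in range(n_actions):
                P_pi[s * n_actions + a, s2 * n_actions + a2] = P[s, a, s2] * pi[s2, a2]

# Successor representation: Psi = (I - gamma * P_pi)^{-1}, i.e. the
# expected discounted visitation counts of every state-action pair.
Psi = np.linalg.inv(np.eye(sa) - gamma * P_pi)

# Discounted occupancy of pi from a uniform start distribution d0:
# d_pi = (1 - gamma) * d0_sa^T @ Psi. It sums to 1 by construction.
d0 = np.full(n_states, 1.0 / n_states)
d0_sa = (d0[:, None] * pi).reshape(sa)
d_pi = (1 - gamma) * d0_sa @ Psi

# MIS density ratio against an arbitrary sampling distribution d_D.
d_D = rng.dirichlet(np.ones(sa))
ratio = d_pi / d_D
```

In the deep RL setting the paper targets, `Psi` is not inverted in closed form but learned with standard value-based machinery, which is what makes the approach scale to Atari and MuJoCo.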
Pages: 12
Related Papers (50 total; entries [21]-[30] shown)
  • [21] Associative Learning of an Unnormalized Successor Representation
    Verosky, Niels J.
    NEURAL COMPUTATION, 2024, 36 (07) : 1410 - 1423
  • [22] A reinforcement learning approach to rare trajectory sampling
    Rose, Dominic C.
    Mair, Jamie F.
    Garrahan, Juan P.
    NEW JOURNAL OF PHYSICS, 2021, 23 (01):
  • [23] A Novel Adaptive Sampling Strategy for Deep Reinforcement Learning
    Liang, Xingxing
    Chen, Li
    Feng, Yanghe
    Liu, Zhong
    Ma, Yang
    Huang, Kuihua
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2021, 20 (02)
  • [24] Bayesian Reinforcement Learning via Deep, Sparse Sampling
    Grover, Divya
    Basu, Debabrota
    Dimitrakakis, Christos
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 3036 - 3044
  • [25] Deep Reinforcement Learning With Graph Representation for Vehicle Repositioning
    Yu, Zishun
    Hu, Mengqi
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (08) : 13094 - 13107
  • [26] A State Representation Dueling Network for Deep Reinforcement Learning
    Qiu, Haomin
    Liu, Feng
    2020 IEEE 32ND INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2020, : 669 - 674
  • [27] Competitive-cooperative-concurrent reinforcement learning with importance sampling
    Uchibe, E
    Doya, K
    FROM ANIMALS TO ANIMATS 8, 2004, : 287 - 296
  • [28] Multi-Agent Reinforcement Learning via Adaptive Kalman Temporal Difference and Successor Representation
    Salimibeni, Mohammad
    Mohammadi, Arash
    Malekzadeh, Parvin
    Plataniotis, Konstantinos N.
    SENSORS, 2022, 22 (04)
  • [29] Efficient Deep Reinforcement Learning via Policy-Extended Successor Feature Approximator
    Li, Yining
    Yang, Tianpei
    Hao, Jianye
    Zheng, Yan
    Tang, Hongyao
    DISTRIBUTED ARTIFICIAL INTELLIGENCE, DAI 2022, 2023, 13824 : 29 - 44
  • [30] Traffic Signal Control with Successor Feature-Based Deep Reinforcement Learning Agent
    Szoke, Laszlo
    Aradi, Szilard
    Becsi, Tamas
    ELECTRONICS, 2023, 12 (06)