Efficient Deep Reinforcement Learning via Policy-Extended Successor Feature Approximator

Authors
Li, Yining [1]
Yang, Tianpei [1,2]
Hao, Jianye [1]
Zheng, Yan [1]
Tang, Hongyao [1]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin, Peoples R China
[2] Univ Alberta, Edmonton, AB, Canada
Keywords
Reinforcement learning; Transfer learning; Successor features; Policy representation
DOI
10.1007/978-3-031-25549-6_3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Successor Features (SFs) improve the generalization of reinforcement learning across unseen tasks by decoupling the dynamics of the environment from the rewards. However, this decomposition depends heavily on the policy learned on the task, which may not be optimal in other tasks. To improve the generalization of SFs, we propose a novel SF learning paradigm, the Policy-extended Successor Feature Approximator (PeSFA), which decouples the SFs from the policy by learning a policy representation module and feeding the policy representation into the SFs as an input. Once the SFs are fitted well over the policy representation space, better SFs for any task can be obtained directly by searching that space. Experimental results show that PeSFA significantly improves the generalizability of SFs and accelerates learning in two representative environments.
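The decoupling described in the abstract can be illustrated with a minimal tabular sketch. It shows only the standard successor-feature decomposition, not PeSFA itself: if rewards are linear in features, r(s, a) = phi(s, a) · w, then the SFs psi of a fixed policy depend on the dynamics and the policy but not on w, so Q(s, a) = psi(s, a) · w can be recomputed for a new task by swapping w. The function name, the toy MDP, and the task vectors below are hypothetical and chosen only for illustration.

```python
import numpy as np

def successor_features(P, phi, policy, gamma=0.9):
    """Closed-form SFs psi[s, a] of a fixed policy in a small tabular MDP.

    P:      (S, A, S) transition probabilities
    phi:    (S, A, d) features, assumed to satisfy r(s, a) = phi[s, a] @ w
    policy: (S, A) action probabilities
    """
    S, A, d = phi.shape
    # Transition matrix over state-action pairs under the policy: (S*A, S*A).
    P_pi = np.einsum("sat,tb->satb", P, policy).reshape(S * A, S * A)
    # psi = phi + gamma * P_pi @ psi  =>  psi = (I - gamma * P_pi)^-1 @ phi
    psi = np.linalg.solve(np.eye(S * A) - gamma * P_pi, phi.reshape(S * A, d))
    return psi.reshape(S, A, d)

# Toy MDP (illustrative values only).
rng = np.random.default_rng(0)
S, A, d = 3, 2, 2
P = rng.random((S, A, S))
P /= P.sum(axis=-1, keepdims=True)
phi = rng.random((S, A, d))
policy = np.full((S, A), 1.0 / A)  # uniform random policy

psi = successor_features(P, phi, policy)

# Two tasks share dynamics and features and differ only in the reward weights.
w_task1 = np.array([1.0, 0.0])
w_task2 = np.array([0.0, 1.0])
q_task1 = psi @ w_task1  # (S, A) action values of the policy on task 1
q_task2 = psi @ w_task2  # same psi reused on task 2, no re-learning
```

Note that `psi` is still tied to the single `policy` passed in, which is the limitation the abstract points to. PeSFA, as described there, additionally feeds a learned policy representation into the SF approximator so that SFs of different policies can be queried and searched over.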
Pages: 29-44
Page count: 16