Distributional Reward Decomposition for Reinforcement Learning

Cited by: 0
Authors
Lin, Zichuan [1, 2]
Zhao, Li [2]
Yang, Derek [3]
Qin, Tao [2]
Yang, Guangwen [1]
Liu, Tie-Yan [2]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Microsoft Res, Redmond, WA USA
[3] Univ Calif San Diego, La Jolla, CA USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Many reinforcement learning (RL) tasks have specific properties that can be leveraged to adapt existing RL algorithms to those tasks and further improve performance. A general class of such properties is the multiple reward channel: in those environments, the full reward can be decomposed into sub-rewards obtained from different channels. Existing work on reward decomposition either requires prior knowledge of the environment to decompose the full reward, or decomposes the reward without prior knowledge but with degraded performance. In this paper, we propose Distributional Reward Decomposition for Reinforcement Learning (DRDRL), a novel reward decomposition algorithm that captures the multiple reward channel structure under the distributional setting. Empirically, our method captures the multi-channel structure and discovers meaningful reward decompositions without requiring any prior knowledge. Consequently, our agent achieves better performance than existing methods on environments with multiple reward channels.
Pages: 10
Related papers
50 in total
  • [41] Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning
    Icarte R.T.
    Klassen T.Q.
    Valenzano R.
    McIlraith S.A.
    Journal of Artificial Intelligence Research, 2022, 73 : 173 - 208
  • [42] Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning
    Icarte, Rodrigo Toro
    Klassen, Toryn Q.
    Valenzano, Richard
    McIlraith, Sheila A.
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2022, 73 : 173 - 208
  • [43] Biologically inspired reinforcement learning: Reward-based decomposition for multi-goal environments
    Zhou, WD
    Coggins, R
    BIOLOGICALLY INSPIRED APPROACHES TO ADVANCED INFORMATION TECHNOLOGY, 2004, 3141 : 80 - 94
  • [44] Multi-Ship Dynamic Weapon-Target Assignment via Cooperative Distributional Reinforcement Learning With Dynamic Reward
    Peng, Zhe
    Lu, Zhifeng
    Mao, Xiao
    Ye, Feng
    Huang, Kuihua
    Wu, Guohua
    Wang, Ling
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024
  • [45] Actively learning costly reward functions for reinforcement learning
    Eberhard, Andre
    Metni, Houssam
    Fahland, Georg
    Stroh, Alexander
    Friederich, Pascal
    MACHINE LEARNING-SCIENCE AND TECHNOLOGY, 2024, 5 (01):
  • [46] Learning classifier system with average reward reinforcement learning
    Zang, Zhaoxiang
    Li, Dehua
    Wang, Junying
    Xia, Dan
    KNOWLEDGE-BASED SYSTEMS, 2013, 40 : 58 - 71
  • [47] Reinforcement Learning for Data Preparation with Active Reward Learning
    Berti-Equille, Laure
    INTERNET SCIENCE, INSCI 2019, 2019, 11938 : 121 - 132
  • [48] Active Learning for Reward Estimation in Inverse Reinforcement Learning
    Lopes, Manuel
    Melo, Francisco
    Montesano, Luis
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT II, 2009, 5782 : 31 - +
  • [49] Learning Reward Machines for Partially Observable Reinforcement Learning
    Icarte, Rodrigo Toro
    Waldie, Ethan
    Klassen, Toryn Q.
    Valenzano, Richard
    Castro, Margarita P.
    McIlraith, Sheila A.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [50] A Distributional Perspective on Multiagent Cooperation With Deep Reinforcement Learning
    Huang, Liwei
    Fu, Mingsheng
    Rao, Ananya
    Irissappane, Athirai A.
    Zhang, Jie
    Xu, Chengzhong
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (03) : 4246 - 4259