Generalization to New Actions in Reinforcement Learning

Cited by: 0
Authors
Jain, Ayush [1]
Szot, Andrew [1]
Lim, Joseph J. [1]
Affiliations
[1] Univ Southern Calif, Dept Comp Sci, Los Angeles, CA 90089 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A fundamental trait of intelligence is the ability to achieve goals in the face of novel circumstances, such as making decisions from new action choices. However, standard reinforcement learning assumes a fixed set of actions and requires expensive retraining when given a new action set. To make learning agents more adaptable, we introduce the problem of zero-shot generalization to new actions. We propose a two-stage framework where the agent first infers action representations from action information acquired separately from the task. A policy flexible to varying action sets is then trained with generalization objectives. We benchmark generalization on sequential tasks, such as selecting from an unseen tool-set to solve physical reasoning puzzles and stacking towers with novel 3D shapes. Videos and code are available at https://sites.google.com/view/action-generalization.
Pages: 12