Generalization to New Actions in Reinforcement Learning

Cited by: 0

Authors:

Jain, Ayush [1]

Szot, Andrew [1]

Lim, Joseph J. [1]

Affiliations:

[1] Univ Southern Calif, Dept Comp Sci, Los Angeles, CA 90089 USA

Source:

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119 | 2020 / Vol. 119

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A fundamental trait of intelligence is the ability to achieve goals in the face of novel circumstances, such as making decisions from new action choices. However, standard reinforcement learning assumes a fixed set of actions and requires expensive retraining when given a new action set. To make learning agents more adaptable, we introduce the problem of zero-shot generalization to new actions. We propose a two-stage framework where the agent first infers action representations from action information acquired separately from the task. A policy flexible to varying action sets is then trained with generalization objectives. We benchmark generalization on sequential tasks, such as selecting from an unseen tool-set to solve physical reasoning puzzles and stacking towers with novel 3D shapes. Videos and code are available at https://sites.google.com/view/action-generalization.
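The idea of a policy that is flexible to varying action sets can be illustrated with a minimal sketch. It is not the authors' architecture: the function name, the dot-product scoring rule, the hand-written representation vectors, and the tool names are all hypothetical, chosen only to show how a policy that scores actions through their representations can be applied to an action set it has never seen.

```python
import math
from typing import Dict, Sequence


def action_probabilities(
    state: Sequence[float],
    action_reprs: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Softmax over whichever actions are currently available.

    Each action is scored by the dot product of the state features and that
    action's representation (an assumed scoring rule), so the number and
    identity of the actions can change without changing the policy.
    """
    scores = {
        name: sum(s * a for s, a in zip(state, rep))
        for name, rep in action_reprs.items()
    }
    # Subtract the maximum score for numerical stability.
    max_score = max(scores.values())
    exp_scores = {name: math.exp(v - max_score) for name, v in scores.items()}
    total = sum(exp_scores.values())
    return {name: e / total for name, e in exp_scores.items()}


# Hypothetical state features and action representations. In the paper's
# setting the representations would be inferred from separately acquired
# action information; here they are written by hand.
state = [1.0, 0.0]
train_actions = {"hammer": [2.0, 0.0], "ramp": [0.0, 1.0]}
new_actions = {"wedge": [1.0, 1.0], "bucket": [-1.0, 0.5], "lever": [1.0, -2.0]}

train_probs = action_probabilities(state, train_actions)
new_probs = action_probabilities(state, new_actions)  # unseen set, same policy
```

Because the policy depends on actions only through their representations, the same function handles a two-action training set and a three-action unseen set without retraining.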

Pages: 12

50 items in total

[31] Learning More Complex Actions with Deep Reinforcement Learning
Wang, Chenxi
Du, Youtian
Xie, Shengyuan
Lu, Yongdi
2021 FIFTH IEEE INTERNATIONAL CONFERENCE ON ROBOTIC COMPUTING (IRC 2021), 2021: 121-122
[32] Heuristic Selection of Actions in Multiagent Reinforcement Learning
Bianchi, Reinaldo A. C.
Ribeiro, Carlos H. C.
Costa, Anna H. R.
20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2007: 690-695
[33] Reusability and Transferability of Macro Actions for Reinforcement Learning
Chang Y.-H.
Chang K.-Y.
Kuo H.
Lee C.-Y.
ACM Transactions on Evolutionary Learning and Optimization, 2022, 2 (01)
[34] An Efficient Reinforcement Learning Algorithm for Continuous Actions
Fu Bo
Chen Xin
He Yong
Wu Min
2013 25TH CHINESE CONTROL AND DECISION CONFERENCE (CCDC), 2013: 80-85
[35] Delayed reinforcement learning of multidimensional control actions
Cichosz, P.
Systems Analysis Modelling Simulation, 1996, 24 (1-3): 233-248
[36] Pyramid Representations of the Set of Actions in Reinforcement Learning
Iglesias, R.
Alvarez-Santos, V.
Rodriguez, M. A.
Santos-Saavedra, D.
Regueiro, C. V.
Pardo, X. M.
BIOINSPIRED COMPUTATION IN ARTIFICIAL SYSTEMS, PT II, 2015, 9108: 203-212
[37] Biologically plausible reinforcement learning of continuous actions
Jaldert O Rombouts
Pieter R Roelfsema
Sander M Bohte
BMC Neuroscience, 14 (Suppl 1)
[38] Hierarchical reinforcement learning based on macro actions
Hao Jiang
Gongju Wang
Shengze Li
Jieyuan Zhang
Long Yan
Xinhai Xu
Complex & Intelligent Systems, 2025, 11 (6)
[39] Using Predictive Representations to Improve Generalization in Reinforcement Learning
Rafols, Eddie J.
Ring, Mark B.
Sutton, Richard S.
Tanner, Brian
19TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-05), 2005: 835-840
[40] Experience generalization for multi-agent reinforcement learning
Pegoraro, R
Costa, AHR
Ribeiro, CHC
SCCC 2001: XXI INTERNATIONAL CONFERENCE OF THE CHILEAN COMPUTER SCIENCE SOCIETY, PROCEEDINGS, 2001: 233-239