PROBING TRANSFER IN DEEP REINFORCEMENT LEARNING WITHOUT TASK ENGINEERING

Cited by: 0

Authors:

Rusu, Andrei A. [1]

Flennerhag, Sebastian [1]

Rao, Dushyant [1]

Pascanu, Razvan [1]

Hadsell, Raia [1]

Affiliations:

[1] DeepMind, London, England

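Source:

CONFERENCE ON LIFELONG LEARNING AGENTS, VOL 199 | 2022 / Vol. 199

Keywords:

GO; ENVIRONMENT; LEVEL; GAME;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract: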

We evaluate the use of original game curricula supported by the Atari 2600 console as a heterogeneous transfer benchmark for deep reinforcement learning agents. Game designers created curricula using combinations of several discrete modifications to the basic versions of games such as Space Invaders, Breakout and Freeway, making them progressively more challenging for human players. By formally organising these modifications into several factors of variation, we are able to show that Analyses of Variance (ANOVA) are a potent tool for studying the effects of human-relevant domain changes on the learning and transfer performance of a deep reinforcement learning agent. Since no manual task engineering is needed on our part, leveraging the original multi-factorial design avoids the pitfalls of unintentionally biasing the experimental setup. We find that game design factors have a large and statistically significant impact on an agent's ability to learn, and so do their combinatorial interactions. Furthermore, we show that zero-shot transfer from the basic games to their respective variations is possible, but the variance in performance is also largely explained by interactions between factors. As such, we argue that Atari game curricula offer a challenging benchmark for transfer learning in RL, that can help the community better understand the generalisation capabilities of RL agents along dimensions which meaningfully impact human generalisation performance. As a start, we report that value-function finetuning of regularly trained agents achieves positive transfer in a majority of cases, but significant headroom for algorithmic innovation remains. We conclude with the observation that selective transfer from multiple variants could further improve performance.
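As a rough illustration of the kind of analysis the abstract describes, the sketch below runs a balanced two-way ANOVA with an interaction term on synthetic scores. It is not the authors' code: the factor names (`difficulty`, `mode`), the scores, and the function `two_way_anova` are all hypothetical, and it reports only sums of squares, F statistics, and the share of variance explained by each term.

```python
from itertools import product
from statistics import mean


def two_way_anova(scores):
    """Balanced two-way ANOVA with interaction.

    `scores` maps (level_a, level_b) -> list of replicate scores.
    Every cell must exist and hold the same number (>= 2) of replicates.
    """
    levels_a = sorted({a for a, _ in scores})
    levels_b = sorted({b for _, b in scores})
    n = len(next(iter(scores.values())))  # replicates per cell
    assert n >= 2, "need at least two replicates per cell"
    assert all(len(v) == n for v in scores.values()), "design must be balanced"
    assert set(scores) == set(product(levels_a, levels_b)), "missing cell"

    grand = mean(x for v in scores.values() for x in v)
    mean_a = {a: mean(x for b in levels_b for x in scores[(a, b)]) for a in levels_a}
    mean_b = {b: mean(x for a in levels_a for x in scores[(a, b)]) for b in levels_b}
    cell = {k: mean(v) for k, v in scores.items()}

    # Sums of squares for the two main effects, their interaction, and error.
    ss = {
        "A": n * len(levels_b) * sum((mean_a[a] - grand) ** 2 for a in levels_a),
        "B": n * len(levels_a) * sum((mean_b[b] - grand) ** 2 for b in levels_b),
        "A:B": n * sum((cell[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
                       for a, b in scores),
        "error": sum((x - cell[k]) ** 2 for k, v in scores.items() for x in v),
    }
    df = {
        "A": len(levels_a) - 1,
        "B": len(levels_b) - 1,
        "A:B": (len(levels_a) - 1) * (len(levels_b) - 1),
        "error": len(levels_a) * len(levels_b) * (n - 1),
    }
    ss_total = sum(ss.values())
    ms_error = ss["error"] / df["error"]
    # F statistic per term; undefined (None) if there is no residual variance.
    f_stat = {k: (ss[k] / df[k]) / ms_error if ms_error > 0 else None
              for k in ("A", "B", "A:B")}
    # Share of total variance attributed to each term (eta squared).
    eta_sq = {k: ss[k] / ss_total if ss_total > 0 else 0.0
              for k in ("A", "B", "A:B")}
    return {"ss": ss, "df": df, "F": f_stat, "eta_sq": eta_sq}


if __name__ == "__main__":
    # Synthetic scores: factor A = difficulty, factor B = mode (both made up).
    synthetic = {
        ("easy", "mode0"): [1, 3],
        ("easy", "mode1"): [3, 5],
        ("hard", "mode0"): [5, 7],
        ("hard", "mode1"): [15, 17],
    }
    result = two_way_anova(synthetic)
    for term in ("A", "B", "A:B"):
        print(term, result["F"][term], round(result["eta_sq"][term], 3))
```

A large `A:B` share relative to the main effects would correspond to the abstract's observation that interactions between factors explain much of the variance. Turning F statistics into p-values needs an F-distribution routine, which is left out here.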


Pages: 24
