Fast and slow curiosity for high-level exploration in reinforcement learning

Cited by: 16
Authors
Bougie, Nicolas [1 ,2 ]
Ichise, Ryutaro [1 ,2 ]
Affiliations
[1] Natl Inst Informat, Tokyo, Japan
[2] Grad Univ Adv Studies, Sokendai, Tokyo, Japan
Keywords
Reinforcement learning; Exploration; Autonomous exploration; Curiosity in reinforcement learning; Networks
DOI
10.1007/s10489-020-01849-3
CLC classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Deep reinforcement learning (DRL) algorithms rely on carefully designed environment rewards that are extrinsic to the agent. However, in many real-world scenarios rewards are sparse or delayed, motivating the need for efficient exploration strategies. While intrinsically motivated agents hold the promise of better local exploration, solving problems that require coordinated decisions over long time horizons remains an open problem. We postulate that to discover such strategies, a DRL agent should be able to combine local and high-level exploration behaviors. To this end, we introduce the concept of fast and slow curiosity, which aims to incentivize long-horizon exploration. Our method decomposes the curiosity bonus into a fast reward that handles local exploration and a slow reward that encourages global exploration. We formulate this bonus as the error in the agent's ability to reconstruct observations given their contexts. We further propose to dynamically weight local and high-level strategies by measuring state diversity. We evaluate our method on a variety of benchmark environments, including Minigrid, Super Mario Bros, and Atari games. Experimental results show that our agent outperforms prior approaches on most tasks in terms of exploration efficiency and mean scores.
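The abstract describes the mechanism only at a high level. As a rough illustration of how a fast/slow decomposition of the curiosity bonus could be combined with diversity-based weighting, the Python sketch below mixes two reconstruction-error rewards. All function names, the sigmoid weighting, and the pairwise-distance diversity proxy are assumptions made for illustration; they are not taken from the paper.

# Minimal sketch (not the authors' implementation) of a fast/slow curiosity bonus.
import numpy as np

def reconstruction_error(decoder, context, observation):
    # Squared error between a decoded prediction and the true observation.
    prediction = np.asarray(decoder(context))
    return float(np.mean((prediction - np.asarray(observation)) ** 2))

def state_diversity(recent_states):
    # Crude diversity proxy: mean pairwise distance over recently visited states.
    states = np.asarray(recent_states)
    if len(states) < 2:
        return 0.0
    dists = [np.linalg.norm(a - b)
             for i, a in enumerate(states) for b in states[i + 1:]]
    return float(np.mean(dists))

def intrinsic_bonus(fast_decoder, slow_decoder,
                    local_context, global_context,
                    observation, recent_states):
    # Fast reward: reconstruction error from a short, local context
    # (encourages local exploration).
    r_fast = reconstruction_error(fast_decoder, local_context, observation)
    # Slow reward: reconstruction error from a long-horizon, global context
    # (encourages high-level exploration).
    r_slow = reconstruction_error(slow_decoder, global_context, observation)
    # Dynamic weighting by state diversity, squashed into (0, 1);
    # the sigmoid is purely an illustrative choice.
    w = 1.0 / (1.0 + np.exp(-state_diversity(recent_states)))
    return (1.0 - w) * r_fast + w * r_slow

if __name__ == "__main__":
    # Toy usage with identity "decoders" and random states.
    rng = np.random.default_rng(0)
    obs = rng.normal(size=8)
    states = [rng.normal(size=8) for _ in range(5)]
    bonus = intrinsic_bonus(lambda c: c, lambda c: c,
                            local_context=obs + rng.normal(scale=0.1, size=8),
                            global_context=rng.normal(size=8),
                            observation=obs,
                            recent_states=states)
    print("intrinsic bonus:", bonus)

In this toy version the decoders are plain callables; in practice they would be learned models (e.g. autoencoder decoders conditioned on short- and long-horizon contexts), and the diversity measure would be computed over embedded states rather than raw observations.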
Pages: 1086-1107
Page count: 22
Related papers
50 records
  • [1] Fast and slow curiosity for high-level exploration in reinforcement learning
    Nicolas Bougie
    Ryutaro Ichise
    Applied Intelligence, 2021, 51 : 1086 - 1107
  • [2] Towards High-Level Intrinsic Exploration in Reinforcement Learning
    Bougie, Nicolas
    Ichise, Ryutaro
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 5186 - 5187
  • [3] Curiosity-driven Exploration in Reinforcement Learning
    Gregor, Michal
    Spalek, Juraj
    2014 ELEKTRO, 2014, : 435 - 440
  • [4] Fast and Inexpensive High-Level Synthesis Design Space Exploration: Machine Learning to the Rescue
    Rashid, Md Imtiaz
    Schafer, Benjamin Carrion
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2023, 42 (11) : 3939 - 3950
  • [5] Reinforcement Learning, Fast and Slow
    Botvinick, Matthew
    Ritter, Sam
    Wang, Jane X.
    Kurth-Nelson, Zeb
    Blundell, Charles
    Hassabis, Demis
    TRENDS IN COGNITIVE SCIENCES, 2019, 23 (05) : 408 - 422
  • [6] Reinforcement learning for instance segmentation with high-level priors
    Hilt, Paul
    Zarvandi, Maedeh
    Kaziakhmedov, Edgar
    Bhide, Sourabh
    Laptin, Maria
    Pape, Constantin
    Kreshuk, Anna
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW, 2023, : 3915 - 3924
  • [7] Learning to explore by reinforcement over high-level options
    Juncheng Liu
    Brendan McCane
    Steven Mills
    Machine Vision and Applications, 2024, 35
  • [8] Learning to explore by reinforcement over high-level options
    Liu, Juncheng
    McCane, Brendan
    Mills, Steven
    MACHINE VISION AND APPLICATIONS, 2024, 35 (01)
  • [9] Reinforcement learning for high-level fuzzy Petri nets
    Shen, VRL
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2003, 33 (02): : 351 - 362
  • [10] A fast exploration procedure for analog high-level specification translation
    Pandit, Soumya
    Bhattacharya, Sumit K.
    Mandal, Chittaranjan
    Patra, Amit
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2008, 27 (08) : 1493 - 1497