Fast and slow curiosity for high-level exploration in reinforcement learning

Cited by: 16
Authors
Bougie, Nicolas [1 ,2 ]
Ichise, Ryutaro [1 ,2 ]
Affiliations
[1] Natl Inst Informat, Tokyo, Japan
[2] Grad Univ Adv Studies, Sokendai, Tokyo, Japan
Keywords
Reinforcement learning; Exploration; Autonomous exploration; Curiosity in reinforcement learning; Networks
DOI
10.1007/s10489-020-01849-3
CLC classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Deep reinforcement learning (DRL) algorithms rely on carefully designed environment rewards that are extrinsic to the agent. However, in many real-world scenarios rewards are sparse or delayed, motivating the need for efficient exploration strategies. While intrinsically motivated agents hold the promise of better local exploration, solving problems that require coordinated decisions over long time horizons remains an open problem. We postulate that to discover such strategies, a DRL agent should be able to combine local and high-level exploration behaviors. To this end, we introduce the concept of fast and slow curiosity, which aims to incentivize long-horizon exploration. Our method decomposes the curiosity bonus into a fast reward that handles local exploration and a slow reward that encourages global exploration. We formulate this bonus as the error in the agent's ability to reconstruct observations given their contexts. We further propose to dynamically weight local and high-level strategies by measuring state diversity. We evaluate our method on a variety of benchmark environments, including Minigrid, Super Mario Bros, and Atari games. Experimental results show that our agent outperforms prior approaches on most tasks in terms of exploration efficiency and mean scores.
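The abstract describes the mechanism only at a high level. As a rough illustration of how a fast/slow decomposition of the curiosity bonus could be combined with diversity-based weighting, the Python sketch below mixes two reconstruction-error rewards. All function names, the sigmoid weighting, and the pairwise-distance diversity proxy are assumptions made for illustration; they are not taken from the paper.

# Minimal sketch (not the authors' implementation) of a fast/slow curiosity bonus.
import numpy as np

def reconstruction_error(decoder, context, observation):
    # Squared error between a decoded prediction and the true observation.
    prediction = np.asarray(decoder(context))
    return float(np.mean((prediction - np.asarray(observation)) ** 2))

def state_diversity(recent_states):
    # Crude diversity proxy: mean pairwise distance over recently visited states.
    states = np.asarray(recent_states)
    if len(states) < 2:
        return 0.0
    dists = [np.linalg.norm(a - b)
             for i, a in enumerate(states) for b in states[i + 1:]]
    return float(np.mean(dists))

def intrinsic_bonus(fast_decoder, slow_decoder,
                    local_context, global_context,
                    observation, recent_states):
    # Fast reward: reconstruction error from a short, local context
    # (encourages local exploration).
    r_fast = reconstruction_error(fast_decoder, local_context, observation)
    # Slow reward: reconstruction error from a long-horizon, global context
    # (encourages high-level exploration).
    r_slow = reconstruction_error(slow_decoder, global_context, observation)
    # Dynamic weighting by state diversity, squashed into (0, 1);
    # the sigmoid is purely an illustrative choice.
    w = 1.0 / (1.0 + np.exp(-state_diversity(recent_states)))
    return (1.0 - w) * r_fast + w * r_slow

if __name__ == "__main__":
    # Toy usage with identity "decoders" and random states.
    rng = np.random.default_rng(0)
    obs = rng.normal(size=8)
    states = [rng.normal(size=8) for _ in range(5)]
    bonus = intrinsic_bonus(lambda c: c, lambda c: c,
                            local_context=obs + rng.normal(scale=0.1, size=8),
                            global_context=rng.normal(size=8),
                            observation=obs,
                            recent_states=states)
    print("intrinsic bonus:", bonus)

In this toy version the decoders are plain callables; in practice they would be learned models (e.g. autoencoder decoders conditioned on short- and long-horizon contexts), and the diversity measure would be computed over embedded states rather than raw observations.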
Pages: 1086-1107
Page count: 22
Related papers
50 records
  • [1] Fast and slow curiosity for high-level exploration in reinforcement learning
    Nicolas Bougie
    Ryutaro Ichise
    Applied Intelligence, 2021, 51 : 1086 - 1107
  • [2] Towards High-Level Intrinsic Exploration in Reinforcement Learning
    Bougie, Nicolas
    Ichise, Ryutaro
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 5186 - 5187
  • [3] Curiosity-driven Exploration in Reinforcement Learning
    Gregor, Michal
    Spalek, Juraj
    2014 ELEKTRO, 2014, : 435 - 440
  • [4] Fast and Inexpensive High-Level Synthesis Design Space Exploration: Machine Learning to the Rescue
    Rashid, Md Imtiaz
    Schafer, Benjamin Carrion
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2023, 42 (11) : 3939 - 3950
  • [5] Reinforcement Learning, Fast and Slow
    Botvinick, Matthew
    Ritter, Sam
    Wang, Jane X.
    Kurth-Nelson, Zeb
    Blundell, Charles
    Hassabis, Demis
    TRENDS IN COGNITIVE SCIENCES, 2019, 23 (05) : 408 - 422
  • [6] Reinforcement learning for instance segmentation with high-level priors
    Hilt, Paul
    Zarvandi, Maedeh
    Kaziakhmedov, Edgar
    Bhide, Sourabh
    Laptin, Maria
    Pape, Constantin
    Kreshuk, Anna
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW, 2023, : 3915 - 3924
  • [7] Learning to explore by reinforcement over high-level options
    Juncheng Liu
    Brendan McCane
    Steven Mills
    Machine Vision and Applications, 2024, 35
  • [8] Learning to explore by reinforcement over high-level options
    Liu, Juncheng
    McCane, Brendan
    Mills, Steven
    MACHINE VISION AND APPLICATIONS, 2024, 35 (01)
  • [9] Reinforcement learning for high-level fuzzy Petri nets
    Shen, VRL
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2003, 33 (02): : 351 - 362
  • [10] A fast exploration procedure for analog high-level specification translation
    Pandit, Soumya
    Bhattacharya, Sumit K.
    Mandal, Chittaranjan
    Patra, Amit
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2008, 27 (08) : 1493 - 1497