Scaled free-energy based reinforcement learning for robust and efficient learning in high-dimensional state spaces

Cited by: 8
Authors
Elfwing, Stefan [1 ]
Uchibe, Eiji [1 ]
Doya, Kenji [1 ]
Institutions
[1] Grad Univ, Okinawa Inst Sci & Technol, Neural Computat Unit, Onna Son, Okinawa 9040412, Japan
Keywords
reinforcement learning; free-energy; restricted Boltzmann machine; robot navigation; function approximation; SPATIAL COGNITION; NAVIGATION; MODEL;
DOI
10.3389/fnbot.2013.00003
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Free-energy based reinforcement learning (FERL) was proposed for learning in high-dimensional state and action spaces, which cannot be handled by standard function approximation methods. In this study, we propose a scaled version of free-energy based reinforcement learning to achieve more robust and more efficient learning performance. The action-value function is approximated by the negative free energy of a restricted Boltzmann machine, divided by a constant scaling factor that is related to the size of the Boltzmann machine (the square root of the number of state nodes in this study). Our first task is a digit floor gridworld task, where the states are represented by images of handwritten digits from the MNIST data set. The purpose of the task is to investigate the proposed method's ability, through the extraction of task-relevant features in the hidden layer, to cluster images of the same digit and to cluster images of different digits that correspond to states with the same optimal action. We also test the method's robustness with respect to different exploration schedules, i.e., different settings of the initial temperature and the temperature discount rate in softmax action selection. Our second task is a robot visual navigation task, where the robot can learn its position by the different colors of the lower part of four landmarks and it can infer the correct corner goal area by the color of the upper part of the landmarks. The state space consists of binarized camera images with, at most, nine different colors, which is equal to 6642 binary states. For both tasks, the learning performance is compared with standard FERL and with function approximation where the action-value function is approximated by a two-layered feedforward neural network.
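The core quantities described in the abstract can be sketched compactly: the Q-value is the negative free energy of an RBM over the concatenated (state, action) visible vector, divided by the square root of the number of state nodes, and actions are drawn by softmax selection at a decaying temperature. The following is a minimal illustrative sketch, not the authors' implementation; the network sizes, weight initialization, and variable names are assumptions for demonstration (the paper's tasks use MNIST images and binarized camera images as states).

```python
import numpy as np

rng = np.random.default_rng(0)

N_S, N_A, N_H = 16, 4, 8                    # state, action, hidden node counts (illustrative)
W = rng.normal(0.0, 0.1, (N_S + N_A, N_H))  # visible-to-hidden weights
b = np.zeros(N_S + N_A)                     # visible biases
c = np.zeros(N_H)                           # hidden biases

def q_value(state, action):
    """Scaled negative free energy of the RBM for the visible vector (state, action).

    F(v) = -b.v - sum_j log(1 + exp(c_j + v.W_j));  Q(s, a) = -F(s, a) / sqrt(N_S)
    """
    v = np.concatenate([state, action])
    free_energy = -b @ v - np.sum(np.logaddexp(0.0, c + v @ W))
    return -free_energy / np.sqrt(N_S)      # scaling factor: sqrt(number of state nodes)

def softmax_action(state, tau):
    """Softmax (Boltzmann) action selection at temperature tau."""
    actions = np.eye(N_A)                   # one-hot action codes
    q = np.array([q_value(state, a) for a in actions])
    p = np.exp((q - q.max()) / tau)         # shift by max for numerical stability
    p /= p.sum()
    return rng.choice(N_A, p=p)

# Exploration schedule: temperature decays geometrically over episodes.
s = rng.integers(0, 2, N_S).astype(float)   # a random binary state
tau0, tau_discount = 1.0, 0.99              # initial temperature and discount rate
a = softmax_action(s, tau0)
```

Dividing by sqrt(N_S) keeps the Q-value magnitude roughly independent of the input dimensionality, which is what makes the softmax temperature schedule transfer across Boltzmann machines of different sizes.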
Pages: 10
Related papers
50 records
  • [21] Robust And Efficient High-Dimensional Quantum State Tomography
    Rambach, Markus
    Qaryan, Mahdi
    Kewming, Michael
    Ferrie, Christopher
    White, Andrew G.
    Romero, Jacquiline
    2021 CONFERENCE ON LASERS AND ELECTRO-OPTICS EUROPE & EUROPEAN QUANTUM ELECTRONICS CONFERENCE (CLEO/EUROPE-EQEC), 2021,
  • [22] Robust and Efficient High-Dimensional Quantum State Tomography
    Rambach, Markus
    Qaryan, Mahdi
    Kewming, Michael
    Ferrie, Christopher
    White, Andrew G.
    Romero, Jacquiline
    PHYSICAL REVIEW LETTERS, 2021, 126 (10)
  • [23] Efficient Learning on High-dimensional Operational Data
    Samani, Forough Shahab
    Zhang, Hongyi
    Stadler, Rolf
    2019 15TH INTERNATIONAL CONFERENCE ON NETWORK AND SERVICE MANAGEMENT (CNSM), 2019,
  • [24] Biologically inspired incremental learning for high-dimensional spaces
    Gepperth, Alexander
    Hecht, Thomas
    Lefort, Mathieu
    Koerner, Ursula
    5TH INTERNATIONAL CONFERENCE ON DEVELOPMENT AND LEARNING AND ON EPIGENETIC ROBOTICS (ICDL-EPIROB), 2015, : 269 - 275
  • [25] Feature Selection and Feature Learning for High-dimensional Batch Reinforcement Learning: A Survey
    Liu, De-Rong
    Li, Hong-Liang
    Wang, Ding
    INTERNATIONAL JOURNAL OF AUTOMATION AND COMPUTING, 2015, 12 (03) : 229 - 242
  • [26] Learning Energy-Based Models in High-Dimensional Spaces with Multiscale Denoising-Score Matching
    Li, Zengyi
    Chen, Yubei
    Sommer, Friedrich T.
    ENTROPY, 2023, 25 (10)
  • [27] Learning latent representations in high-dimensional state spaces using polynomial manifold constructions
    Geelen, Rudy
    Balzano, Laura
    Willcox, Karen
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 4960 - 4965
  • [28] Stochastic Neural Network Approach for Learning High-Dimensional Free Energy Surfaces
    Schneider, Elia
    Dai, Luke
    Topper, Robert Q.
    Drechsel-Grau, Christof
    Tuckerman, Mark E.
    PHYSICAL REVIEW LETTERS, 2017, 119 (15)
  • [29] High-Dimensional Stock Portfolio Trading with Deep Reinforcement Learning
    Pigorsch, Uta
    Schaefer, Sebastian
    2022 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE FOR FINANCIAL ENGINEERING AND ECONOMICS (CIFER), 2022,
  • [30] Optimizing high-dimensional stochastic forestry via reinforcement learning
    Tahvonen, Olli
    Suominen, Antti
    Malo, Pekka
    Viitasaari, Lauri
    Parkatti, Vesa-Pekka
    JOURNAL OF ECONOMIC DYNAMICS & CONTROL, 2022, 145