A Reinforcement Learning Method for Continuous Domains Using Artificial Hydrocarbon Networks

Cited by: 0
Authors
Ponce, Hiram [1 ]
Gonzalez-Mora, Guillermo [1 ]
Martinez-Villasenor, Lourdes [1 ]
Affiliations
[1] Univ Panamer, Fac Ingn, Augusto Rodin 498, Ciudad De Mexico 03920, Mexico
Keywords
reinforcement learning; artificial hydrocarbon networks; artificial organic networks; continuous domain; policy search;
DOI: Not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline Codes: 081104; 0812; 0835; 1405
Abstract
Reinforcement learning with continuous states and actions has received limited study, owing to difficulties in determining the transition function, poor performance of continuous-to-discrete relaxations, and other issues. Yet real-world problems, e.g. robotics, require such methods for learning complex tasks. In this paper, we therefore propose a method for reinforcement learning with continuous states and actions using a model-based approach learned with artificial hydrocarbon networks (AHN). The proposed method models the dynamics of the continuous task with the supervised AHN method. Initial random rollouts and subsequent data collection from policy evaluation improve the training of the AHN-based dynamics model. Preliminary results on the well-known mountain car task show that artificial hydrocarbon networks can contribute to model-based approaches in continuous RL problems in both estimation efficiency (root mean squared error of 0.0012) and sub-optimal policy convergence (reached in 357 steps), in just 5 trials over a parameter space theta ∈ R^86. Data from experimental results are available at: http://sites.google.com/up.edu.mx/reinforcementlearning/.
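The abstract outlines a standard model-based RL pipeline: collect random rollouts on the continuous task, fit a supervised dynamics model to the transitions, then search for a policy against that model. A minimal sketch of the first two stages is below, under stated assumptions: ridge regression on hand-chosen features stands in for the artificial hydrocarbon network, the mountain-car-style dynamics constants are illustrative rather than taken from the paper, and all function names (`step`, `random_rollouts`, `featurize`, `fit_model`) are hypothetical.

```python
import numpy as np

def step(pos, vel, action):
    # Illustrative mountain-car-style dynamics (constants assumed, not the paper's).
    vel = np.clip(vel + 0.001 * action - 0.0025 * np.cos(3 * pos), -0.07, 0.07)
    pos = np.clip(pos + vel, -1.2, 0.6)
    return pos, vel

def random_rollouts(n_steps, rng):
    # Stage 1: initial random rollouts, logging (state, action) -> next-state pairs.
    X, Y = [], []
    pos, vel = -0.5, 0.0
    for _ in range(n_steps):
        a = rng.uniform(-1.0, 1.0)          # continuous action in [-1, 1]
        npos, nvel = step(pos, vel, a)
        X.append([pos, vel, a])
        Y.append([npos, nvel])
        pos, vel = npos, nvel
    return np.array(X), np.array(Y)

def featurize(X):
    # Hand-chosen basis; the paper learns this mapping with an AHN instead.
    pos, vel, a = X[:, 0], X[:, 1], X[:, 2]
    return np.column_stack([np.ones_like(pos), pos, vel, a, np.cos(3 * pos)])

def fit_model(X, Y, lam=1e-6):
    # Stage 2: supervised dynamics model via ridge regression (closed form).
    Phi = featurize(X)
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ Y)

def predict(W, X):
    return featurize(X) @ W

rng = np.random.default_rng(0)
X, Y = random_rollouts(2000, rng)
W = fit_model(X, Y)
rmse = np.sqrt(np.mean((predict(W, X) - Y) ** 2))
print(f"dynamics-model RMSE: {rmse:.6f}")
```

In the paper's method, the ridge-regression stand-in would be replaced by the supervised AHN model, and the resulting dynamics model would be used for policy search, with new transitions from policy evaluation appended to the training set on each iteration.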
Pages: 398-403
Page count: 6