A comparison of learning performance in two-dimensional Q-learning by the difference of Q-values alignment

Cited by: 0
Authors
Kathy Thi Aung
Takayasu Fuchida
Affiliations
[1] Kagoshima University, Department of Information and Computer Science, Graduate School of Science and Engineering
Keywords
Q-learning; Q-value; Voronoi Q-value element; Single agent; State space;
DOI
10.1007/s10015-011-0961-5
Abstract
In this article, we examine the learning performance of various strategies under different conditions using reward-based Voronoi Q-value elements (VQEs) in a single-agent environment, where the VQEs determine how the agent acts in a given state. To test our hypotheses, we performed computational experiments under several conditions: various rotation angles of VQEs arranged in a lattice structure, various rotation angles of the agent's four-action set, and a random arrangement of VQEs, in order to correctly evaluate the optimal Q-values of state-action pairs when handling continuous-valued inputs. The results show that learning performance changes when the angle of the VQEs and the angle of the actions stand in a specific relative position.
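The core idea of handling continuous-valued inputs with VQEs can be sketched as follows: the continuous 2-D state is quantised to its nearest VQE centre (a Voronoi partition), and ordinary tabular Q-learning runs on the resulting indices. This is a minimal illustrative sketch, assuming the VQEs are fixed centre points with four actions each; all names and parameter values are hypothetical, not taken from the paper.

```python
import math

def nearest_vqe(state, centres):
    """Index of the VQE centre closest to a continuous 2-D state
    (i.e. the Voronoi cell the state falls into)."""
    return min(range(len(centres)),
               key=lambda i: math.dist(state, centres[i]))

def q_update(Q, s_idx, action, reward, s2_idx, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step on the discretised state indices."""
    best_next = max(Q[s2_idx])
    Q[s_idx][action] += alpha * (reward + gamma * best_next - Q[s_idx][action])

# A 3x3 lattice of VQE centres, four actions per element.
centres = [(x, y) for x in (0.0, 0.5, 1.0) for y in (0.0, 0.5, 1.0)]
Q = [[0.0] * 4 for _ in centres]

# A continuous state maps to the centre (0.5, 0.5), then a normal update runs.
s_idx = nearest_vqe((0.42, 0.55), centres)
q_update(Q, s_idx, action=2, reward=1.0, s2_idx=s_idx)
```

Rotating the lattice of centres (or the action directions) changes which Voronoi cell each continuous state falls into, which is the mechanism the experiments above vary.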
Pages: 473-477
Number of pages: 4
Related papers
50 items in total
  • [31] Neural Q-learning
    ten Hagen, S
    Kröse, B
    NEURAL COMPUTING & APPLICATIONS, 2003, 12 (02): 81 - 88
  • [32] Logistic Q-Learning
    Bas-Serrano, Joan
    Curi, Sebastian
    Krause, Andreas
    Neu, Gergely
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [33] Enhancing Nash Q-learning and Team Q-learning mechanisms by using bottlenecks
    Ghazanfari, Behzad
    Mozayani, Nasser
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2014, 26 (06) : 2771 - 2783
  • [34] Control with adaptive Q-learning: A comparison for two classical control problems
    Araujo, Joao Pedro
    Figueiredo, Mario A. T.
    Botto, Miguel Ayala
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2022, 112
  • [35] Performance Investigation of UCB Policy in Q-Learning
    Saito, Koki
    Notsu, Akira
    Ubukata, Seiki
    Honda, Katsuhiro
    2015 IEEE 14TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2015, : 777 - 780
  • [36] Multi Q-Table Q-Learning
    Kantasewi, Nitchakun
    Marukatat, Sanparith
    Thainimit, Somying
    Manabu, Okumura
    2019 10TH INTERNATIONAL CONFERENCE OF INFORMATION AND COMMUNICATION TECHNOLOGY FOR EMBEDDED SYSTEMS (IC-ICTES), 2019,
  • [37] An Online Home Energy Management System using Q-Learning and Deep Q-Learning
    Izmitligil, Hasan
    Karamancioglu, Abdurrahman
    SUSTAINABLE COMPUTING-INFORMATICS & SYSTEMS, 2024, 43
  • [38] Using free energies to represent Q-values in a multiagent reinforcement learning task
    Sallans, B
    Hinton, GE
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 13, 2001, 13 : 1075 - 1081
  • [39] Temporal-Difference Q-learning in Active Fault Diagnosis
    Skach, Jan
    Puncochar, Ivo
    Lewis, Frank L.
    2016 3RD CONFERENCE ON CONTROL AND FAULT-TOLERANT SYSTEMS (SYSTOL), 2016, : 287 - 292
  • [40] Two-level Q-learning: learning from conflict demonstrations
    Li, Mao
    Wei, Yi
    Kudenko, Daniel
    KNOWLEDGE ENGINEERING REVIEW, 2019, 34