A comparison of learning performance in two-dimensional Q-learning by the difference of Q-values alignment

Cited by: 0
Authors
Kathy Thi Aung
Takayasu Fuchida
Affiliations
[1] Kagoshima University, Department of Information and Computer Science, Graduate School of Science and Engineering
Keywords
Q-learning; Q-value; Voronoi Q-value element; Single agent; State space;
DOI
10.1007/s10015-011-0961-5
Abstract
In this article, we examine the learning performance of various strategies under different conditions using reward-based Voronoi Q-value elements (VQEs) in a single-agent environment, where the VQEs determine how the agent acts in a given state. To test our hypotheses, we performed computational experiments under several conditions: various rotation angles of VQEs arranged in a lattice structure, various rotation angles of the agent's four-action set, and a random arrangement of VQEs, in order to correctly evaluate the optimal Q-values of state-action pairs when handling continuous-valued inputs. The results show that learning performance changes when the angle of the VQEs and the angle of the actions stand in a specific relative position.
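The core idea of handling continuous-valued inputs with VQEs can be sketched as follows: the continuous 2-D state is quantised to its nearest VQE centre (a Voronoi partition), and ordinary tabular Q-learning runs on the resulting indices. This is a minimal illustrative sketch, assuming the VQEs are fixed centre points with four actions each; all names and parameter values are hypothetical, not taken from the paper.

```python
import math

def nearest_vqe(state, centres):
    """Index of the VQE centre closest to a continuous 2-D state
    (i.e. the Voronoi cell the state falls into)."""
    return min(range(len(centres)),
               key=lambda i: math.dist(state, centres[i]))

def q_update(Q, s_idx, action, reward, s2_idx, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step on the discretised state indices."""
    best_next = max(Q[s2_idx])
    Q[s_idx][action] += alpha * (reward + gamma * best_next - Q[s_idx][action])

# A 3x3 lattice of VQE centres, four actions per element.
centres = [(x, y) for x in (0.0, 0.5, 1.0) for y in (0.0, 0.5, 1.0)]
Q = [[0.0] * 4 for _ in centres]

# A continuous state maps to the centre (0.5, 0.5), then a normal update runs.
s_idx = nearest_vqe((0.42, 0.55), centres)
q_update(Q, s_idx, action=2, reward=1.0, s2_idx=s_idx)
```

Rotating the lattice of centres (or the action directions) changes which Voronoi cell each continuous state falls into, which is the mechanism the experiments above vary.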
Pages: 473-477
Number of pages: 4
Related papers
50 items in total
  • [31] Neural Q-learning
    ten Hagen, S
    Kröse, B
    NEURAL COMPUTING & APPLICATIONS, 2003, 12 (02): 81 - 88
  • [32] Logistic Q-Learning
    Bas-Serrano, Joan
    Curi, Sebastian
    Krause, Andreas
    Neu, Gergely
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [33] Enhancing Nash Q-learning and Team Q-learning mechanisms by using bottlenecks
    Ghazanfari, Behzad
    Mozayani, Nasser
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2014, 26 (06) : 2771 - 2783
  • [34] Control with adaptive Q-learning: A comparison for two classical control problems
    Araujo, Joao Pedro
    Figueiredo, Mario A. T.
    Botto, Miguel Ayala
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2022, 112
  • [35] Performance Investigation of UCB Policy in Q-Learning
    Saito, Koki
    Notsu, Akira
    Ubukata, Seiki
    Honda, Katsuhiro
    2015 IEEE 14TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2015, : 777 - 780
  • [36] Multi Q-Table Q-Learning
    Kantasewi, Nitchakun
    Marukatat, Sanparith
    Thainimit, Somying
    Manabu, Okumura
    2019 10TH INTERNATIONAL CONFERENCE OF INFORMATION AND COMMUNICATION TECHNOLOGY FOR EMBEDDED SYSTEMS (IC-ICTES), 2019,
  • [37] An Online Home Energy Management System using Q-Learning and Deep Q-Learning
    Izmitligil, Hasan
    Karamancioglu, Abdurrahman
    SUSTAINABLE COMPUTING-INFORMATICS & SYSTEMS, 2024, 43
  • [38] Using free energies to represent Q-values in a multiagent reinforcement learning task
    Sallans, B
    Hinton, GE
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 13, 2001, 13 : 1075 - 1081
  • [39] Temporal-Difference Q-learning in Active Fault Diagnosis
    Skach, Jan
    Puncochar, Ivo
    Lewis, Frank L.
    2016 3RD CONFERENCE ON CONTROL AND FAULT-TOLERANT SYSTEMS (SYSTOL), 2016, : 287 - 292
  • [40] Two-level Q-learning: learning from conflict demonstrations
    Li, Mao
    Wei, Yi
    Kudenko, Daniel
    KNOWLEDGE ENGINEERING REVIEW, 2019, 34