Reinforcement distribution in a team of cooperative Q-learning agents

Cited by: 4

Authors:

Abbasi, Zahra [1]

Abbasi, Mohammad Ali [2]

Affiliations:

[1] Islamic Azad Univ, Parand Branch, Tehran, Iran

[2] Univ Tehran, Fac Engn, Dept Elect & Comp Engn, Tehran 14174, Iran

Source:

PROCEEDINGS OF NINTH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING | 2008

Keywords:

agent learning, evolution, and adaptation; multiagent systems; cooperative distributed problem solving; coordination, cooperation, and teamwork; multiagent learning

DOI:

10.1109/SNPD.2008.154

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In a multi-agent Q-learning group, agents cooperate with each other during learning to perform their assigned task and increase team performance. If the role of each agent is clearly specified, which is a very hard task for a supervisor agent, the team learns more efficiently, because in this case each agent is reinforced according to its real effect on team performance. Assuming an identical role for all agents is the technique researchers most commonly use to avoid the modeling complexity, but we believe it is not the optimal method of reinforcement distribution. The main goal of this research is to find an indirect evaluation method that evaluates the role of each agent in the team and distributes the reinforcement signal accordingly. The expertness of each agent is used as a criterion to estimate the effect of that agent's actions on team performance. Random and equal reinforcement distribution methods are also implemented as baselines against which expertness-based reinforcement sharing is evaluated. In addition, a new test bed, called EPIDEM, is developed to evaluate the proposed methods. The results show that distributing the reinforcement signal with the proposed method improves the team's learning speed.
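The three distribution schemes named in the abstract (equal, random, and expertness-based) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define how expertness is measured, so the sketch simply assumes each agent has a non-negative expertness score and splits the team reward in proportion to it. All function names, the proportional rule, and the tabular `q_update` helper (the standard one-step Q-learning update) are assumptions made for illustration.

```python
import random


def distribute_equal(team_reward, n_agents):
    """Give every agent the same share of the team reward."""
    return [team_reward / n_agents] * n_agents


def distribute_random(team_reward, n_agents, rng):
    """Split the team reward using random, normalized weights."""
    weights = [rng.random() for _ in range(n_agents)]
    total = sum(weights)
    if total == 0:
        return distribute_equal(team_reward, n_agents)
    return [team_reward * w / total for w in weights]


def distribute_by_expertness(team_reward, expertness):
    """Split the team reward in proportion to each agent's expertness.

    Assumption: expertness scores are non-negative numbers; how they are
    computed is not specified in the abstract.
    """
    total = sum(expertness)
    if total <= 0:
        # No usable expertness information: fall back to equal shares.
        return distribute_equal(team_reward, len(expertness))
    return [team_reward * e / total for e in expertness]


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step tabular Q-learning update for a single agent."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


if __name__ == "__main__":
    rng = random.Random(0)
    print(distribute_equal(10.0, 2))
    print(distribute_random(10.0, 2, rng))
    print(distribute_by_expertness(10.0, [1.0, 3.0]))
```

In this sketch, each agent would pass its own share of the team reward as `reward` to `q_update`, so an agent with a higher assumed expertness score receives a larger reinforcement for the same team outcome.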


Pages: 154 / +

Page count: 3

