Reinforcement distribution in a team of cooperative Q-learning agents

Cited by: 4
Authors
Abbasi, Zahra [1]
Abbasi, Mohammad Ali [2]
Affiliations
[1] Islamic Azad Univ, Parand Branch, Tehran, Iran
[2] Univ Tehran, Fac Engn, Dept Elect & Comp Engn, Tehran 14174, Iran
Keywords
agent learning, evolution, and adaptation; multiagent systems; cooperative distributed problem solving; coordination, cooperation, and teamwork; multiagent learning
DOI
10.1109/SNPD.2008.154
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In a team of Q-learning agents, the agents cooperate with each other while learning to perform their assigned task, in order to increase team performance. If the role of each agent is clearly specified, which is a very hard task for a supervisor agent, the team learns more efficiently, because in this case each agent is reinforced according to its real effect on team performance. Assuming an identical role for all agents is the technique most commonly used by current researchers to avoid the modeling complexities, but we believe this is not the optimal method for reinforcement distribution. The main goal of this research is to find an indirect evaluation method that evaluates the role of each agent in the team and distributes the reinforcement signal accordingly. The expertness of each agent is used as a criterion to estimate the effect of each agent's action on team performance. Random and equal reinforcement signal distribution methods are also used as baselines against which expertness-based reinforcement sharing is evaluated. In addition, a new test bed, called EPIDEM, is developed to evaluate the proposed methods. The results show that distributing the reinforcement signals according to the proposed method improves the team's learning speed.
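The three distribution schemes named in the abstract (equal, random, and expertness-based) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define how expertness is measured or how shares are computed, so the sketch assumes each agent has a nonnegative scalar expertness score and that the team signal is split in proportion to those scores. All function names here are hypothetical.

```python
import random
from typing import List, Optional


def distribute_equal(team_reward: float, n_agents: int) -> List[float]:
    """Equal sharing: every agent receives the same fraction of the team signal."""
    if n_agents <= 0:
        raise ValueError("n_agents must be positive")
    return [team_reward / n_agents] * n_agents


def distribute_random(team_reward: float, n_agents: int,
                      rng: Optional[random.Random] = None) -> List[float]:
    """Random sharing: shares are drawn at random and normalised to sum to the team signal."""
    rng = rng or random.Random()
    weights = [rng.random() for _ in range(n_agents)]
    total = sum(weights)
    if total == 0.0:
        return distribute_equal(team_reward, n_agents)
    return [team_reward * w / total for w in weights]


def distribute_by_expertness(team_reward: float,
                             expertness: List[float]) -> List[float]:
    """Expertness-based sharing (assumed rule): shares proportional to expertness.

    `expertness` holds one nonnegative score per agent; how the score is
    obtained is left open, since the abstract does not specify it.
    """
    if any(e < 0 for e in expertness):
        raise ValueError("expertness scores are assumed to be nonnegative")
    total = sum(expertness)
    if total == 0.0:
        # No agent has any measured expertness yet: fall back to equal sharing.
        return distribute_equal(team_reward, len(expertness))
    return [team_reward * e / total for e in expertness]
```

Under these assumptions, an agent with three times the expertness of a teammate receives three times the share of the team signal; each agent would then use its own share as the reward term in its individual Q-learning update.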
Pages: 154 / +
Page count: 3
Related papers
50 in total
  • [31] Deep Reinforcement Learning with Sarsa and Q-Learning: A Hybrid Approach
    Xu, Zhi-xiong
    Cao, Lei
    Chen, Xi-liang
    Li, Chen-xi
    Zhang, Yong-liang
    Lai, Jun
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2018, E101D (09) : 2315 - 2322
  • [32] Constraints Penalized Q-learning for Safe Offline Reinforcement Learning
    Xu, Haoran
    Zhan, Xianyuan
    Zhu, Xiangyu
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 8753 - 8760
  • [33] Reinforcement distribution in continuous state action space fuzzy Q-learning: A novel approach
    Bonarini, A
    Montrone, F
    Restelli, M
    FUZZY LOGIC AND APPLICATIONS, 2006, 3849 : 40 - 45
  • [34] Training and delayed reinforcements in Q-learning agents
    Caironi, PVC
    Dorigo, M
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 1997, 12 (10) : 695 - 724
  • [35] Q-learning agents in a Cournot oligopoly model
    Waltman, Ludo
    Kaymak, Uzay
    JOURNAL OF ECONOMIC DYNAMICS & CONTROL, 2008, 32 (10) : 3275 - 3293
  • [36] Q-Learning Transformation for Training on JADE Agents
    Cepero-Perez, Nayma
    Moreno-Espino, Mailyn
    REVISTA DIGITAL LAMPSAKOS, 2015, (14) : 25 - 32
  • [37] Multiagent Q-learning with Sub-Team Coordination
    Huang, Wenhan
    Li, Kai
    Shao, Kun
    Zhou, Tianze
    Taylor, Matthew E.
    Luo, Jun
    Wang, Dongge
    Mao, Hangyu
    Hao, Jianye
    Wang, Jun
    Deng, Xiaotie
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [38] Q-learning as a model of utilitarianism in a human–machine team
    Samantha Krening
    Neural Computing and Applications, 2023, 35 : 16853 - 16864
  • [39] The Sample Complexity of Teaching-by-Reinforcement on Q-Learning
    Zhang, Xuezhou
    Bharti, Shubham Kumar
    Ma, Yuzhe
    Singla, Adish
    Zhu, Xiaojin
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 10939 - 10947
  • [40] Expected Lenient Q-learning: a fast variant of the Lenient Q-learning algorithm for cooperative stochastic Markov games
    Amhraoui, Elmehdi
    Masrour, Tawfik
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, 15 (07) : 2781 - 2797