Constrained Multiagent Markov Decision Processes: a Taxonomy of Problems and Algorithms

Cited by: 0
Authors
de Nijs, Frits [1 ]
Walraven, Erwin [2 ]
de Weerdt, Mathijs M. [2 ]
Spaan, Matthijs T. J. [2 ]
Affiliations
[1] Monash Univ, Fac IT, Dept Data Sci & AI, 20 Exhibition Walk, Clayton, Vic 3168, Australia
[2] Delft Univ Technol, Van Mourik Broekmanweg 6, NL-2628 XE Delft, Netherlands
Keywords
Optimal policies; Complexity; Chains; Agents
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In domains such as electric vehicle charging, smart distribution grids, and autonomous warehouses, multiple agents share the same resources. When planning the use of these resources, agents need to deal with the uncertainty in these domains. Although several models and algorithms for such constrained multiagent planning problems under uncertainty have been proposed in the literature, it remains unclear which algorithm can be applied under which conditions. In this survey we conceptualize these domains and establish a generic problem class based on Markov decision processes. We identify and compare the conditions under which algorithms from the planning literature for problems in this class can be applied: whether constraints are soft or hard, whether agents are continuously connected, whether the domain is fully observable, whether a constraint is instantaneous or applies to a budget over time, and whether the constraint covers a single resource or multiple resources. Furthermore, we discuss the advantages and disadvantages of these algorithms. We conclude by identifying open problems, both directly related to the conceptualized domains and in adjacent research areas.
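The generic problem class described in the abstract can be summarized informally as follows. This is a minimal sketch only; the notation (n agents with policies \pi_i, local states s_{i,t}, rewards R_i, resource consumption functions c_{i,k}, limits L_k, and horizon h) is assumed for illustration and is not taken from this record.

\[
\max_{\pi_1,\dots,\pi_n} \;\; \sum_{i=1}^{n} \mathbb{E}\!\left[ \sum_{t=1}^{h} R_i\big(s_{i,t}, \pi_i(s_{i,t})\big) \right]
\]

subject to, for every shared resource k with limit L_k, either an instantaneous constraint that must hold at every time step,

\[
\sum_{i=1}^{n} c_{i,k}\big(s_{i,t}, \pi_i(s_{i,t})\big) \;\le\; L_k \qquad \forall t \in \{1,\dots,h\},
\]

or a budget constraint over the whole horizon,

\[
\sum_{t=1}^{h} \sum_{i=1}^{n} c_{i,k}\big(s_{i,t}, \pi_i(s_{i,t})\big) \;\le\; L_k .
\]

Under this reading, a soft constraint requires the inequality to hold in expectation over the uncertainty, whereas a hard constraint requires it for every realization; the single-resource case corresponds to a single index k.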
Pages: 955 - 1001
Number of pages: 47