Multi-Agent Reinforcement Learning with Prospect Theory

Cited by: 0
Authors
Danis, Dominic [1]
Parmacek, Parker [1]
Dunajsky, David [1]
Ramasubramanian, Bhaskar [1]
Affiliations
[1] Western Washington Univ, Elect & Comp Engn, Bellingham, WA 98225 USA
Funding
U.S. National Science Foundation;
Keywords
DECISION; RISK;
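DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract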
Recent advances in cyber and cyber-physical systems have informed the development of scalable and efficient algorithms for these systems to learn behaviors when operating in uncertain and unknown environments. When such systems share their operating environments with human users, such as in autonomous driving, it is important to be able to learn behaviors for each entity in the environment that will (i) recognize the presence of other entities, and (ii) be aligned with the preferences of one or more human users in the environment. While multi-agent reinforcement learning (MARL) provides a modeling, design, and analysis paradigm for (i), there remains a gap in the development of strategies to solve (ii). In this paper, we aim to bridge this gap through the design, analysis, and evaluation of MARL algorithms that recognize the preferences of human users. We use cumulative prospect theory (CPT) to model multiple human traits, such as a tendency to view gains and losses differently and to evaluate outcomes relative to a reference point. We define a CPT-based value function, and learn agent policies by optimizing this value function. To this end, we develop MA-CPT-Q, a multi-agent CPT-based Q-learning algorithm, and establish its convergence. We adapt this algorithm to a setting where any agent can call upon 'more experienced' agents to aid its own learning process, and propose MA-CPT-Q-WS, a multi-agent CPT-based Q-learning algorithm with weight sharing. We evaluate both algorithms in an environment where agents have to reach a target state while avoiding collisions with obstacles and with other agents. Our results show that agents that learn policies with MA-CPT-Q and MA-CPT-Q-WS behave in ways better aligned with those of human users who might be placed in the same environment.
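To make the two CPT traits named in the abstract concrete (asymmetric treatment of gains and losses, and evaluation relative to a reference point), the following is a minimal sketch of a CPT value for a discrete outcome distribution. It is not the paper's MA-CPT-Q algorithm and is not taken from the paper. The function names (`utility`, `weight`, `cpt_value`) are hypothetical, and the default parameters are assumptions borrowed from the commonly cited Tversky and Kahneman (1992) estimates; the record above does not state which utility or weighting functions the authors use.

```python
import numpy as np

def utility(x, ref=0.0, alpha=0.88, beta=0.88, lam=2.25):
    """Assumed power utility: concave for gains, steeper (lam > 1) for losses."""
    d = np.asarray(x, dtype=float) - ref          # outcome relative to reference
    gains = np.maximum(d, 0.0) ** alpha
    losses = lam * np.maximum(-d, 0.0) ** beta
    return gains - losses

def weight(p, gamma):
    """Assumed inverse-S probability weighting function; w(0)=0, w(1)=1."""
    p = np.clip(np.asarray(p, dtype=float), 0.0, 1.0)
    return p**gamma / (p**gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)

def cpt_value(outcomes, probs, ref=0.0, gamma_gain=0.61, gamma_loss=0.69):
    """CPT value of a discrete random outcome (illustrative sketch only)."""
    x = np.asarray(outcomes, dtype=float)
    p = np.asarray(probs, dtype=float)
    order = np.argsort(x)                         # sort outcomes ascending
    x, p = x[order], p[order]

    upper = np.cumsum(p[::-1])[::-1]              # P(X >= x_i)
    lower = np.cumsum(p)                          # P(X <= x_i)

    # Decision weights are differences of weighted tail probabilities:
    # upper tails for gains, lower tails for losses.
    pi_gain = weight(upper, gamma_gain) - weight(upper - p, gamma_gain)
    pi_loss = weight(lower, gamma_loss) - weight(lower - p, gamma_loss)
    pi = np.where(x >= ref, pi_gain, pi_loss)

    return float(np.sum(pi * utility(x, ref)))

if __name__ == "__main__":
    # A fair +10 / -10 gamble has zero expected value, but loss aversion
    # makes its CPT value negative under these assumed parameters.
    print(cpt_value([10.0, -10.0], [0.5, 0.5]))
    # Moving the reference point changes how the same outcome is judged.
    print(cpt_value([5.0], [1.0], ref=0.0), cpt_value([5.0], [1.0], ref=5.0))
```

In a CPT-based Q-learning setting, a quantity of this kind would take the place of the plain expectation over returns; how the paper estimates it from samples is not described in this record.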
Pages: 9-16
Number of pages: 8