Multi-Agent Reinforcement Learning with Prospect Theory

Cited by: 0

Authors:

Danis, Dominic [1]

Parmacek, Parker [1]

Dunajsky, David [1]

Ramasubramanian, Bhaskar [1]

Affiliations:

[1] Western Washington Univ, Elect & Comp Engn, Bellingham, WA 98225 USA

Source:

2023 PROCEEDINGS OF THE CONFERENCE ON CONTROL AND ITS APPLICATIONS, CT | 2023

Funding:

U.S. National Science Foundation

Keywords:

DECISION; RISK;

DOI:

Not available

Chinese Library Classification (CLC):

TP [Automation technology, computer technology]

Subject classification code:

0812

Abstract:

Recent advances in cyber and cyber-physical systems have informed the development of scalable and efficient algorithms for these systems to learn behaviors when operating in uncertain and unknown environments. When such systems share their operating environments with human users, such as in autonomous driving, it is important to be able to learn behaviors for each entity in the environment that will (i) recognize the presence of other entities, and (ii) be aligned with the preferences of one or more human users in the environment. While multi-agent reinforcement learning (MARL) provides a modeling, design, and analysis paradigm for (i), there remains a gap in the development of strategies to solve (ii). In this paper, we aim to bridge this gap through the design, analysis, and evaluation of MARL algorithms that recognize preferences of human users. We use cumulative prospect theory (CPT) to model multiple human traits such as a tendency to view gains and losses differently, and to evaluate outcomes relative to a reference point. We define a CPT-based value function, and learn agent policies as a consequence of optimizing this value function. To this end, we develop MA-CPT-Q, a multi-agent CPT-based Q-learning algorithm, and establish its convergence. We adapt this algorithm to a setting where any agent can call upon 'more experienced' agents to aid its own learning process, and propose MA-CPT-Q-WS, a multi-agent CPT-based Q-learning algorithm with weight sharing. We evaluate both algorithms in an environment where agents have to reach a target state while avoiding collisions with obstacles and with other agents. Our results show that the behaviors agents learn under MA-CPT-Q and MA-CPT-Q-WS are better aligned with those of human users who might be placed in the same environment.
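
To make the CPT ingredients named in the abstract concrete (gains and losses treated differently, outcomes measured against a reference point), the following is a minimal sketch of how a CPT value might be computed for a discrete set of outcomes. It is not the paper's value function or the MA-CPT-Q algorithm, neither of which is specified in this record. The function names are hypothetical, and the functional forms and default parameters (0.88, 2.25, 0.61, 0.69) are the commonly cited ones from Tversky and Kahneman (1992), assumed here purely for illustration.

```python
def utility(x, alpha=0.88, beta=0.88, lam=2.25):
    """Assumed CPT utility: concave for gains, steeper (loss-averse) for losses."""
    if x >= 0:
        return x ** alpha
    return -lam * (-x) ** beta


def weight(p, gamma):
    """Assumed inverse-S probability weighting function."""
    p = min(max(p, 0.0), 1.0)  # guard against floating-point drift outside [0, 1]
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)


def cpt_value(outcomes, probs, reference=0.0, gamma_gain=0.61, gamma_loss=0.69):
    """CPT value of a discrete lottery, with outcomes taken relative to `reference`."""
    pairs = sorted((x - reference, p) for x, p in zip(outcomes, probs))
    remaining = sum(p for _, p in pairs)  # probability mass not yet passed in sorted order
    below = 0.0                           # probability mass already passed (worse outcomes)
    total = 0.0
    for x, p in pairs:
        remaining -= p                    # now the mass of outcomes later in sorted order
        if x >= 0:
            # Gains: weight the chance of doing at least this well vs. strictly better.
            pi = weight(remaining + p, gamma_gain) - weight(remaining, gamma_gain)
        else:
            # Losses: weight the chance of doing at least this badly vs. strictly worse.
            pi = weight(below + p, gamma_loss) - weight(below, gamma_loss)
        total += pi * utility(x)
        below += p
    return total
```

Under these assumed parameters, a symmetric 50/50 gamble such as `cpt_value([-10.0, 10.0], [0.5, 0.5])` evaluates to a negative number, reflecting loss aversion, whereas its expected value is zero.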

Pages: 9-16

Page count: 8
