Differential Advising in Multiagent Reinforcement Learning

Cited by: 18

Authors:

Ye, Dayong [1,2]

Zhu, Tianqing [4]

Cheng, Zishuo [1,2]

Zhou, Wanlei [3]

Yu, Philip S. [5]

Affiliations:

[1] Univ Technol Sydney, Ctr Cyber Secur & Privacy, Ultimo, NSW 2007, Australia

[2] Univ Technol Sydney, Sch Comp Sci, Ultimo, NSW 2007, Australia

[3] City Univ Macau, Macau, Peoples R China

[4] China Univ Geosci, Sch Comp Sci, Wuhan 430000, Peoples R China

[5] Univ Illinois, Dept Comp Sci, Chicago, IL 60607 USA

Source:

IEEE TRANSACTIONS ON CYBERNETICS | 2022 / Vol. 52 / Issue 06

Funding:

National Natural Science Foundation of China; Australian Research Council;

Keywords:

Robots; Differential privacy; Privacy; Reinforcement learning; Task analysis; Sensitivity; Computer science; Agent advising; differential privacy; multiagent reinforcement learning (MARL);

DOI:

10.1109/TCYB.2020.3034424

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Agent advising is one of the main approaches to improve agent learning performance by enabling agents to share advice. Existing advising methods have a common limitation that an adviser agent can offer advice to an advisee agent only if the advice is created in the same state as the advisee's state. However, in complex environments, it is a very strong requirement that two states are the same, because a state may consist of multiple dimensions and two states being the same means that all these dimensions in the two states are correspondingly identical. Therefore, this requirement may limit the applicability of existing advising methods to complex environments. In this article, inspired by the differential privacy scheme, we propose a differential advising method that relaxes this requirement by enabling agents to use advice in a state even if the advice is created in a slightly different state. Compared with the existing methods, agents using the proposed method have more opportunity to take advice from others. This article is the first to adopt the concept of differential privacy on advising to improve agent learning performance instead of addressing security issues. The experimental results demonstrate that the proposed method is more efficient in complex environments than the existing methods.
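The relaxation described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the neighbouring-state test (states differing in at most `max_diff` dimensions), the Laplace perturbation, and every name below (`differing_dimensions`, `find_usable_advice`, `perturb`, `sensitivity`, `epsilon`) are assumptions made for illustration, loosely following the usual differential privacy notion of neighbouring inputs.

```python
import random


def differing_dimensions(state_a, state_b):
    """Count the dimensions in which two equal-length state tuples differ."""
    if len(state_a) != len(state_b):
        raise ValueError("states must have the same number of dimensions")
    return sum(1 for a, b in zip(state_a, state_b) if a != b)


def find_usable_advice(advice_by_state, advisee_state, max_diff=1):
    """Return (source_state, q_values) for the closest stored state that
    differs from advisee_state in at most max_diff dimensions, else None.

    Hypothetical rule: an exact-match scheme corresponds to max_diff=0.
    """
    best = None
    for source_state, q_values in advice_by_state.items():
        d = differing_dimensions(source_state, advisee_state)
        if d <= max_diff and (best is None or d < best[0]):
            best = (d, source_state, q_values)
    return None if best is None else (best[1], best[2])


def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb(q_values, sensitivity, epsilon, rng=random):
    """Add Laplace noise with scale sensitivity / epsilon to each value
    (assumed mechanism; the parameters are illustrative)."""
    scale = sensitivity / epsilon
    return [q + laplace_noise(scale, rng) for q in q_values]


# Advice created in state (2, 3, 0); the advisee is in state (2, 3, 1).
advice = {(2, 3, 0): [0.1, 0.9, 0.0]}
exact = advice.get((2, 3, 1))                      # exact matching finds nothing
relaxed = find_usable_advice(advice, (2, 3, 1))    # neighbouring state qualifies
```

With exact matching the advisee receives no advice here, whereas the relaxed lookup returns the advice created in the state that differs in a single dimension. How much the advice is perturbed before use, if at all, is a design choice this sketch leaves to the assumed `perturb` helper.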


Pages: 5508 - 5521

Number of pages: 14
