Training;
Perturbation methods;
Heuristic algorithms;
Closed box;
Optimization;
Control systems;
Vehicle dynamics;
Adversarial attack;
deep reinforcement learning (DRL);
defense;
security;
traffic signal control;
ROBUSTNESS;
DOI:
10.1109/TII.2024.3413356
Chinese Library Classification (CLC) number:
TP [Automation technology, computer technology];
Discipline classification code:
0812 ;
Abstract:
Multiagent reinforcement learning (MARL) promises outstanding performance for multiintersection traffic signal control systems (TSCS), enabling intelligent administration of cities. However, the vulnerability of MARL algorithms to adversarial attacks has raised concerns about the security of TSCS. In this article, we explore the robustness of MARL-based TSCS against adversarial attacks, propose a black-box multiobject attack strategy, and assign an attack budget to ensure stealthiness. We design a dynamic threshold-based selection of critical states to minimize the cumulative reward with a limited number of attacks. In addition, we present a lightweight agnostic dynamic threshold-based defense mechanism that enhances the worst-case performance of the policy. We formulate it as a min-max optimization problem, i.e., minimizing the number of training sample alterations while maximizing the cumulative discounted reward of the policy against the perturbed states. Extensive experiments on Simulation of Urban MObility (SUMO) demonstrate that the proposed attack policy can significantly reduce the performance of TSCS.