Near-Optimal Interdiction of Factored MDPs

Cited by: 0

Authors:

Panda, Swetasudha [1]

Vorobeychik, Yevgeniy [1]

Affiliations:

[1] Vanderbilt Univ, Elect Engn & Comp Sci, 221 Kirkland Hall, Nashville, TN 37235 USA

Source:

CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI2017) | 2017

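Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract: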

Stackelberg games have been widely used to model interactions between attackers and defenders in a broad array of security domains. One related approach involves plan interdiction, whereby a defender chooses a subset of actions to block (remove), and the attacker constructs an optimal plan in response. In previous work, this approach has been introduced in the context of Markov decision processes (MDPs). The key challenge, however, is that the state space of MDPs grows exponentially in the number of state variables. We propose a novel scalable MDP interdiction framework which makes use of a factored representation of state, using a parity function basis for representing a value function over a Boolean space. We demonstrate that our approach is significantly more scalable than prior art, while resulting in near-optimal interdiction decisions.
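
Illustrative note (not part of the published abstract): the "parity function basis" mentioned above can be read as the standard Fourier basis on the Boolean cube, where each basis function is χ_S(x) = (-1)^(Σ_{i∈S} x_i) for a subset S of state variables. The sketch below is a minimal Python illustration under that assumption; the function names, the toy value function, and the exhaustive coefficient fit are hypothetical and are not taken from the paper, whose actual algorithm is not described in this record.

```python
from itertools import combinations, product

def parity(x, subset):
    """Parity basis function chi_S(x) = (-1)^(sum of x_i for i in S), with x_i in {0, 1}."""
    return -1 if sum(x[i] for i in subset) % 2 else 1

def parity_basis(n, max_order):
    """Index subsets of size <= max_order over n Boolean variables (hypothetical truncation)."""
    return [s for k in range(max_order + 1) for s in combinations(range(n), k)]

def approx_value(x, weights):
    """Evaluate V(x) ~ sum_S w_S * chi_S(x); `weights` maps subset -> coefficient."""
    return sum(w * parity(x, s) for s, w in weights.items())

def fit_exact(values, n):
    """Exact coefficients w_S = 2^-n * sum_x V(x) * chi_S(x).

    Enumerates all 2^n states, so it is only usable for tiny n; it is here
    purely to show that the full parity basis can represent any value function.
    """
    states = list(product((0, 1), repeat=n))
    return {s: sum(values[x] * parity(x, s) for x in states) / len(states)
            for s in parity_basis(n, n)}

# Toy value function over 3 Boolean state variables (made up for illustration).
toy_values = {x: 2 * x[0] + x[1] * x[2] for x in product((0, 1), repeat=3)}
toy_weights = fit_exact(toy_values, 3)
```

With the full basis the representation is exact; a scalable method would presumably keep only a small set of low-order subsets, which is the role `parity_basis(n, max_order)` plays in this sketch.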

Pages: 10
