Near-Optimal Interdiction of Factored MDPs

Cited by: 0

Authors:

Panda, Swetasudha [1]

Vorobeychik, Yevgeniy [1]

Affiliation:

[1] Vanderbilt Univ, Elect Engn & Comp Sci, 221 Kirkland Hall, Nashville, TN 37235 USA

Source:

CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI2017) | 2017

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Stackelberg games have been widely used to model interactions between attackers and defenders in a broad array of security domains. One related approach involves plan interdiction, whereby a defender chooses a subset of actions to block (remove), and the attacker constructs an optimal plan in response. In previous work, this approach has been introduced in the context of Markov decision processes (MDPs). The key challenge, however, is that the state space of MDPs grows exponentially in the number of state variables. We propose a novel scalable MDP interdiction framework which makes use of a factored representation of state, using a parity function basis for representing a value function over a Boolean space. We demonstrate that our approach is significantly more scalable than prior art, while resulting in near-optimal interdiction decisions.
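To make the phrase "parity function basis" concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard definition of a parity function over Boolean states, chi_S(x) = (-1)^(sum of x_i for i in S), and fits weights by brute-force enumeration of all states, which is feasible only for toy problems and is exactly the exponential cost a factored method is meant to avoid. All function names and the toy value function are hypothetical.

```python
from itertools import combinations, product


def parity(subset, state):
    """chi_S(x): +1 if an even number of the variables in S are 1, else -1."""
    return -1 if sum(state[i] for i in subset) % 2 else 1


def parity_basis(n_vars, max_degree):
    """All variable subsets of size <= max_degree (a low-degree basis)."""
    return [s for d in range(max_degree + 1)
            for s in combinations(range(n_vars), d)]


def fit_weights(value_fn, n_vars, basis):
    """Weight of chi_S is the average of V(x) * chi_S(x) over all states.

    Enumerates all 2**n_vars states, so this is for illustration only.
    """
    states = list(product((0, 1), repeat=n_vars))
    return {s: sum(value_fn(x) * parity(s, x) for x in states) / len(states)
            for s in basis}


def approx_value(weights, state):
    """Evaluate the weighted sum of parity functions at one state."""
    return sum(w * parity(s, state) for s, w in weights.items())


def toy_value(x):
    """Hypothetical value function over three Boolean state variables."""
    return x[0] + 2 * x[1] * x[2]
```

Because `toy_value` involves interactions between at most two variables, `fit_weights(toy_value, 3, parity_basis(3, 2))` reproduces it exactly; a value function with higher-order interactions would only be approximated by a degree-2 basis.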


Pages: 10

50 items in total

[31] Near-Optimal Light Spanners
Chechik, Shiri
Wulff-Nilsen, Christian
ACM TRANSACTIONS ON ALGORITHMS, 2018, 14 (03)
[32] TeXDYNA: Hierarchical Reinforcement Learning in Factored MDPs
Kozlova, Olga
Sigaud, Olivier
Meyer, Christophe
FROM ANIMALS TO ANIMATS 11, 2010, 6226 : 489 - +
[33] Near-optimal block alignments
Tseng, Kuo-Tsung
Yang, Chang-Biau
Huang, Kuo-Si
Peng, Yung-Hsing
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2008, E91D (03) : 789 - 795
[34] Near-optimal list colorings
Molloy, M
Reed, B
RANDOM STRUCTURES & ALGORITHMS, 2000, 17 (3-4) : 376 - 402
[35] Near-optimal sequence alignment
Vingron, M
CURRENT OPINION IN STRUCTURAL BIOLOGY, 1996, 6 (03) : 346 - 352
[36] Optimal, near-optimal, and robust epidemic control
Dylan H. Morris
Fernando W. Rossine
Joshua B. Plotkin
Simon A. Levin
Communications Physics, 4
[37] Optimal, near-optimal, and robust epidemic control
Morris, Dylan H.
Rossine, Fernando W.
Plotkin, Joshua B.
Levin, Simon A.
COMMUNICATIONS PHYSICS, 2021, 4 (01)
[38] OPTIMAL AND NEAR-OPTIMAL BROADCAST IN RANDOM GRAPHS
SCHEINERMAN, ER
WIERMAN, JC
DISCRETE APPLIED MATHEMATICS, 1989, 25 (03) : 289 - 297
[39] Pseudo-MDPs and Factored Linear Action Models
Yao, Hengshuai
Szepesvari, Csaba
Pires, Bernardo Avila
Zhang, Xinhua
2014 IEEE SYMPOSIUM ON ADAPTIVE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING (ADPRL), 2014, : 189 - 197
[40] Piecewise linear value function approximation for factored MDPs
Poupart, P
Boutilier, C
Patrascu, R
Schuurmans, D
EIGHTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-02)/FOURTEENTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE (IAAI-02), PROCEEDINGS, 2002, : 292 - 299
