Near-Optimal Interdiction of Factored MDPs

Authors
Panda, Swetasudha [1]
Vorobeychik, Yevgeniy [1]
Affiliations
[1] Vanderbilt Univ, Elect Engn & Comp Sci, 221 Kirkland Hall, Nashville, TN 37235 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Stackelberg games have been widely used to model interactions between attackers and defenders in a broad array of security domains. One related approach is plan interdiction, in which a defender chooses a subset of actions to block (remove) and the attacker constructs an optimal plan in response. Previous work introduced this approach in the context of Markov decision processes (MDPs). The key challenge, however, is that the state space of an MDP grows exponentially in the number of state variables. We propose a novel, scalable MDP interdiction framework that uses a factored representation of state, with a parity function basis representing the value function over a Boolean space. We demonstrate that our approach is significantly more scalable than prior art while producing near-optimal interdiction decisions.
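As an illustration of the parity function basis mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the standard definition chi_S(x) = (-1)^(sum of x_i for i in S) over Boolean states, and it fits coefficients by brute-force enumeration of all states, which is only feasible for tiny examples (a scalable method would avoid that enumeration). The function names and the toy value function are hypothetical.

```python
from itertools import combinations, product


def parity(subset, state):
    """Parity basis function chi_S(x): +1 if an even number of the
    variables indexed by `subset` are 1 in `state`, otherwise -1."""
    return -1 if sum(state[i] for i in subset) % 2 else 1


def parity_basis(n_vars, max_degree):
    """Index subsets of size <= max_degree, one per basis function."""
    return [s for d in range(max_degree + 1)
            for s in combinations(range(n_vars), d)]


def fit_coefficients(value_fn, n_vars, basis):
    """Project value_fn onto each basis function.

    Parity functions are orthonormal under the uniform distribution on
    {0,1}^n, so w_S = 2^-n * sum_x V(x) * chi_S(x).
    Brute force over all 2^n states: illustration only.
    """
    states = list(product((0, 1), repeat=n_vars))
    return {s: sum(value_fn(x) * parity(s, x) for x in states) / len(states)
            for s in basis}


def approx_value(weights, state):
    """Evaluate the linear combination sum_S w_S * chi_S(state)."""
    return sum(w * parity(s, state) for s, w in weights.items())


def toy_value(x):
    """Hypothetical value function over three Boolean state variables."""
    return 2.0 * x[0] + 1.0 * x[1] * x[2]


n = 3
full = fit_coefficients(toy_value, n, parity_basis(n, n))  # all 8 functions
low = fit_coefficients(toy_value, n, parity_basis(n, 1))   # degree <= 1 only
```

With the full basis the representation is exact for any function on the Boolean cube; restricting to low-degree subsets keeps the number of weights polynomial in the number of state variables at the cost of approximation error.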
Pages: 10
Related papers
  • [1] Near-optimal Reinforcement Learning in Factored MDPs
    Osband, Ian
    Van Roy, Benjamin
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, 27
  • [2] Scalable Initial State Interdiction for Factored MDPs
    Panda, Swetasudha
    Vorobeychik, Yevgeniy
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 4801 - 4807
  • [3] Near-optimal PAC bounds for discounted MDPs
    Lattimore, Tor
    Hutter, Marcus
    THEORETICAL COMPUTER SCIENCE, 2014, 558 : 125 - 143
  • [4] Maximizing Reachability in Factored MDPs via Near-Optimal Clustering with Applications to Control of Multi-Agent Systems
    Fiscko, Carmel
    Kar, Soummya
    Sinopoli, Bruno
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 7970 - 7975
  • [5] Near-Optimal Sample Complexity Bounds for Constrained MDPs
    Vaswani, Sharan
    Yang, Lin F.
    Szepesvari, Csaba
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
  • [6] Methodology for identifying near-optimal interdiction strategies for a power transmission system
    Bier, Vicki M.
    Gratz, Eli R.
    Haphuriwat, Naraphorn J.
    Magua, Wairimu
    Wierzbicki, Kevin R.
    RELIABILITY ENGINEERING & SYSTEM SAFETY, 2007, 92 (09) : 1155 - 1161
  • [7] Factored MDPs for Optimal Prosumer Decision-Making
    Angelidakis, Angelos
    Chalkiadakis, Georgios
    PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS (AAMAS'15), 2015, : 503 - 511
  • [8] Near-optimal Policy Optimization Algorithms for Learning Adversarial Linear Mixture MDPs
    He, Jiafan
    Zhou, Dongruo
    Gu, Quanquan
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
  • [9] Near-Optimal Model-Free Reinforcement Learning in Non-Stationary Episodic MDPs
    Mao, Weichao
    Zhang, Kaiqing
    Zhu, Ruihao
    Simchi-Levi, David
    Basar, Tamer
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [10] Multiagent planning with factored MDPs
    Guestrin, C
    Koller, D
    Parr, R
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 14, VOLS 1 AND 2, 2002, 14 : 1523 - 1530