A sparse sampling algorithm for near-optimal planning in large Markov decision processes

Cited: 0
Authors:
Kearns, M [1]
Mansour, Y [1]
Ng, AY [1]
Affiliations:
[1] AT&T Labs, Murray Hill, NJ 07974 USA
Keywords:
DOI: not available
CLC number:
TP18 [Artificial Intelligence Theory]
Discipline codes:
081104; 0812; 0835; 1405
Abstract:
An issue that is critical for the application of Markov decision processes (MDPs) to realistic problems is how the complexity of planning scales with the size of the MDP. In stochastic environments with very large or even infinite state spaces, traditional planning and reinforcement learning algorithms are often inapplicable, since their running time typically scales linearly with the state space size. In this paper we present a new algorithm that, given only a generative model (simulator) for an arbitrary MDP, performs near-optimal planning with a running time that has no dependence on the number of states. Although the running time is exponential in the horizon time (which depends only on the discount factor gamma and the desired degree of approximation to the optimal policy), our results establish for the first time that there are no theoretical barriers to computing near-optimal policies in arbitrarily large, unstructured MDPs. Our algorithm is based on the idea of sparse sampling. We prove that a randomly sampled look-ahead tree that covers only a vanishing fraction of the full look-ahead tree nevertheless suffices to compute near-optimal actions from any state of an MDP. Practical implementations of the algorithm are discussed, and we draw ties to our related recent results on finding a near-best strategy from a given class of strategies in very large partially observable MDPs [KMN99].
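The sparse sampling idea the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the paper derives specific values of the per-action sample width C and horizon H from the discount factor gamma and the desired accuracy, whereas the defaults below, and the generative-model callback `sample(s, a) -> (next_state, reward)`, are hypothetical placeholders.

```python
def sparse_sample_plan(state, actions, sample, gamma=0.9, C=4, H=3):
    """Estimate Q-values at `state` from a sparse look-ahead tree and
    return (best_action, its_value_estimate).

    `sample(s, a)` is an assumed generative-model interface returning a
    sampled (next_state, reward) pair. Running time is O((C * |actions|)**H),
    with no dependence on the number of states in the MDP.
    """

    def q_value(s, a, depth):
        # Average C sampled one-step outcomes, recursing on each next state.
        total = 0.0
        for _ in range(C):
            s2, r = sample(s, a)
            total += r + gamma * value(s2, depth - 1)
        return total / C

    def value(s, depth):
        if depth == 0:
            return 0.0  # truncate the look-ahead tree at horizon H
        return max(q_value(s, a, depth) for a in actions)

    q = {a: q_value(state, a, H) for a in actions}
    best = max(q, key=q.get)
    return best, q[best]
```

Because the recursion touches only the sampled states, the same call works whether the MDP has ten states or infinitely many; only C and H control the cost and the approximation quality.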
Pages: 1324-1331
Page count: 8
Related Papers
50 records in total
  • [1] A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes
    Kearns, M.; Mansour, Y.; Ng, A. Y.
    MACHINE LEARNING, 2002, 49 (2-3): 193-208
  • [2] Determining Near-Optimal Policies for Markov Renewal Decision Processes
    Boyse, J. W.
    IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, 1974, SMC-4 (2): 215-217
  • [3] Near-Optimal Randomized Exploration for Tabular Markov Decision Processes
    Xiong, Zhihan; Shen, Ruoqi; Cui, Qiwen; Fazel, Maryam; Du, Simon S.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [4] Near-Optimal Entrywise Sampling of Numerically Sparse Matrices
    Braverman, Vladimir; Krauthgamer, Robert; Krishnan, Aditya; Sapir, Shay
    CONFERENCE ON LEARNING THEORY, VOL 134, 2021, 134: 759-773
  • [5] Near-Optimal Time and Sample Complexities for Solving Markov Decision Processes with a Generative Model
    Sidford, Aaron; Wang, Mengdi; Wu, Xian; Yang, Lin F.; Ye, Yinyu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [6] A Minimax Near-Optimal Algorithm for Adaptive Rejection Sampling
    Achddou, Juliette; Lam-Weil, Joseph; Carpentier, Alexandra; Blanchard, Gilles
    ALGORITHMIC LEARNING THEORY, VOL 98, 2019, 98
  • [7] Sparse Roadmap Spanners for Asymptotically Near-Optimal Motion Planning
    Dobson, Andrew; Bekris, Kostas E.
    INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2014, 33 (1): 18-47
  • [8] Layered Gibbs Sampling Algorithm for Near-Optimal Detection in Large-MIMO Systems
    Mandloi, Manish; Bhatia, Vimal
    2017 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC), 2017
  • [9] On a Near-Optimal and Efficient Algorithm for the Sparse Pooled Data Problem
    Hahn-Klimroth, Max; van der Hofstad, Remco; Mueller, Noela; Riddlesden, Connor
    BERNOULLI, 2025, 31 (2): 1579-1605