Backtracking for More Efficient Large Scale Dynamic Programming

Cited by: 1

Authors:

Tripp, Charles [1]

Shachter, Ross [2]

Affiliations:

[1] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA

[2] Stanford Univ, Dept Management Sci & Engn, Stanford, CA 94305 USA

Source:

2012 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2012), VOL 1 | 2012

Keywords:

dynamic programming; reinforcement learning; Q-Learning; experience replay; backtracking;

DOI:

10.1109/ICMLA.2012.63

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Reinforcement learning algorithms are widely used to generate policies for complex Markov decision processes. We introduce backtracking, a modification to reinforcement learning algorithms that can significantly improve their performance, particularly for off-line policy generation. Backtracking waits to perform update calculations until the successor's value has been updated, allowing immediate reuse of update calculations. We demonstrate the effectiveness of backtracking on two benchmark processes using both Q-learning and real-time dynamic programming.
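
The abstract describes the ordering idea only at a high level, so the following is a minimal sketch rather than the authors' algorithm. It assumes a hypothetical five-state chain with a single action, a recorded trajectory, and plain tabular Q-learning; all names (`q_update`, `replay`, `trajectory`) are illustrative. It contrasts replaying the same transitions in forward order with replaying them in reverse order, so that each backup is performed after its successor's value has already been updated.

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, terminal, alpha=1.0, gamma=0.9):
    """One standard tabular Q-learning backup for the transition (s, a, r, s_next)."""
    best_next = 0.0 if terminal else max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def replay(Q, trajectory, actions, backward):
    """Apply one backup per recorded transition, in forward or reverse order."""
    order = reversed(trajectory) if backward else trajectory
    for s, a, r, s_next, terminal in order:
        q_update(Q, s, a, r, s_next, actions, terminal)

# Hypothetical chain (an assumption for illustration): "right" moves s -> s + 1,
# and reaching state 4 ends the episode with reward 1.
actions = ["right"]
trajectory = [
    (s, "right", 1.0 if s + 1 == 4 else 0.0, s + 1, s + 1 == 4)
    for s in range(4)
]

q_forward = defaultdict(float)
q_backward = defaultdict(float)
replay(q_forward, trajectory, actions, backward=False)
replay(q_backward, trajectory, actions, backward=True)

for s in range(4):
    print(s, q_forward[(s, "right")], q_backward[(s, "right")])
```

In this toy setting a single forward pass changes only the value of the last state-action pair, while a single reverse pass propagates the terminal reward back to the start state. The benchmark processes and the exact update scheduling used in the paper are not given in this record.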

Citation

Pages: 338-343

Page count: 6
