DETECTING OPTIMAL AND NONOPTIMAL ACTIONS IN AVERAGE-COST MARKOV DECISION-PROCESSES

Cited by: 2
Author
LASSERRE, JB
Keywords
POLICY ITERATION; LINEAR PROGRAMMING; ELIMINATION OF NONOPTIMAL ACTIONS;
DOI
10.2307/3215322
Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract
We present two sufficient conditions for detecting optimal and non-optimal actions in (ergodic) average-cost MDPs. They are easily interpreted and can be implemented as detection tests in both policy iteration and linear programming methods. An efficient implementation of a recent policy iteration scheme is discussed.
Pages: 979-990
Number of pages: 12