TD algorithm for the variance of return and mean-variance reinforcement learning
Cited by: 0
Authors: Sato, Makoto [1]; Kimura, Hajime [1]; Kobayashi, Shibenobu [1]
Affiliations:
[1] Interdisc. Grad. Sch. Sci. and Eng., Tokyo Institute of Technology
DOI: 10.1527/tjsai.16.353
Pages: 353-362