Batch reinforcement learning with state importance

Authors
Li, LH [1]
Bulitko, V [1]
Greiner, R [1]
Affiliation
[1] Univ Alberta, Dept Comp Sci, Edmonton, AB T6G 2E8, Canada
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We investigate the problem of using function approximation in reinforcement learning where the agent's policy is represented as a classifier mapping states to actions. High classification accuracy is usually assumed to correlate with high policy quality. This is not necessarily the case, however: increasing classification accuracy can actually decrease the policy's quality. This phenomenon occurs when the learning process begins to focus on classifying less "important" states. In this paper, we introduce a measure of a state's decision-making importance that can be used to improve policy learning. The resulting focused learning process is shown to converge faster to better policies.
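The gap between classification accuracy and policy quality can be illustrated with a small sketch. The abstract does not state how importance is defined, so the code below assumes one plausible measure, the gap between the best and worst action values in a state. The Q-values, function names, and both policies are hypothetical and chosen only for illustration.

```python
# Hypothetical illustration: the importance measure, Q-values, and names
# below are assumptions, not the definitions used in the paper.

def state_importance(q_values):
    """Assumed measure: gap between the best and worst action value in a state."""
    return max(q_values) - min(q_values)

def accuracy(policy, greedy):
    """Fraction of states in which the policy picks the greedy action."""
    return sum(p == g for p, g in zip(policy, greedy)) / len(greedy)

def weighted_accuracy(policy, greedy, weights):
    """Accuracy in which each state counts in proportion to its importance."""
    correct = sum(w for p, g, w in zip(policy, greedy, weights) if p == g)
    return correct / sum(weights)

def value_loss(policy, q_table):
    """Total action value given up by deviating from the greedy action."""
    return sum(max(q) - q[a] for q, a in zip(q_table, policy))

# Four states, two actions. Only the first state has a large value gap.
q_table = [[10.0, 0.0], [1.0, 0.9], [0.5, 0.6], [2.0, 2.1]]
greedy = [q.index(max(q)) for q in q_table]
weights = [state_importance(q) for q in q_table]

policy_a = [1, 0, 1, 1]  # right in three states, wrong in the important one
policy_b = [0, 1, 0, 1]  # right in two states, including the important one

for name, policy in (("A", policy_a), ("B", policy_b)):
    print(name,
          "accuracy:", accuracy(policy, greedy),
          "weighted accuracy:", round(weighted_accuracy(policy, greedy, weights), 3),
          "value loss:", round(value_loss(policy, q_table), 3))
```

Under these assumed values, policy A scores higher on plain accuracy yet gives up far more value than policy B, while the importance-weighted accuracy ranks the two policies in the same order as their value loss.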
Pages: 566-568
Page count: 3