Labeling Q-learning in hidden state environments

Cited by: 0
Authors
Hae-Yeon Lee
Hiroyuki Kamaya
Ken-ichi Abe
Affiliations
[1] Tohoku University, Department of Electrical and Communication Engineering, Graduate School of Engineering
[2] Hachinohe National College of Technology, Department of Electrical Engineering
Keywords
Reinforcement learning; Labeling Q-learning; Hidden states environment; Agent; Grid-world; Partially observable Markov decision process (POMDP);
DOI
10.1007/BF02481264
Abstract
Recently, reinforcement learning (RL) methods have been applied to learning problems in environments with embedded hidden states. Conventional RL methods, however, are limited to Markov decision process problems. Several algorithms have been proposed to overcome hidden states, but they require an extremely large amount of memory to store the past sequences that represent historical state transitions. The aim of this work is to extend our previously proposed algorithm for environments with hidden states, called labeling Q-learning (LQ-learning), which reinforces incompletely observed perceptions by labeling. In LQ-learning, the agent has a perception structure consisting of pairs of observations and labels. Using these pairs, the agent can more precisely distinguish hidden states that look identical but are in fact different. Labeling is carried out by labeling functions; many such functions are conceivable, and here we introduce labeling functions based only on the previous and the current observations. The extended LQ-learning is applied to grid-world problems containing hidden states, and the simulation results demonstrate the effectiveness of LQ-learning.
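To make the idea concrete, below is a minimal sketch of Q-learning over (observation, label) pairs, assuming details the abstract does not specify: the environment interface (reset/step), the size of the label set, and a hypothetical labeling function that derives the next label from only the previous and the current observations. It illustrates the general technique, not the authors' exact labeling functions or experimental setup.

```python
import random
from collections import defaultdict

ACTIONS = ["up", "down", "left", "right"]
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
NUM_LABELS = 4  # assumed size of the label set


def labeling_function(prev_obs, prev_label, obs):
    """Hypothetical labeling rule: if two successive observations are identical
    (a symptom of perceptual aliasing), advance the label so that otherwise
    indistinguishable hidden states receive distinct internal states."""
    if prev_obs == obs:
        return (prev_label + 1) % NUM_LABELS
    return 0


def lq_learning(env, episodes=500):
    """Tabular Q-learning over (observation, label) pairs; `env` is assumed to
    expose reset() -> obs and step(action) -> (obs, reward, done)."""
    Q = defaultdict(float)  # keyed by ((obs, label), action)

    def greedy(state):
        return max(ACTIONS, key=lambda a: Q[(state, a)])

    for _ in range(episodes):
        obs = env.reset()
        label = 0
        state = (obs, label)
        done = False
        while not done:
            # epsilon-greedy action selection over the labeled state
            action = random.choice(ACTIONS) if random.random() < EPSILON else greedy(state)
            next_obs, reward, done = env.step(action)
            next_label = labeling_function(obs, label, next_obs)
            next_state = (next_obs, next_label)
            # standard Q-learning update, applied to the labeled state space
            best_next = max(Q[(next_state, a)] for a in ACTIONS)
            Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
            obs, label, state = next_obs, next_label, next_state
    return Q
```

The key design point is that the Q-table is indexed by the labeled state (observation, label) rather than the raw observation, so two aliased observations can map to distinct internal states without storing long observation histories.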
Pages: 181-184
Number of pages: 3
Related Papers
(50 in total; items [41]-[50] shown)
  • [41] Mutual Q-learning
    Reid, Cameron
    Mukhopadhyay, Snehasis
    2020 3RD INTERNATIONAL CONFERENCE ON CONTROL AND ROBOTS (ICCR 2020), 2020, : 128 - 133
  • [42] Neural Q-learning
    Stephan ten Hagen
    Ben Kröse
    Neural Computing & Applications, 2003, 12 : 81 - 88
  • [43] Robust Q-Learning
    Ertefaie, Ashkan
    McKay, James R.
    Oslin, David
    Strawderman, Robert L.
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2021, 116 (533) : 368 - 381
  • [45] Logistic Q-Learning
    Bas-Serrano, Joan
    Curi, Sebastian
    Krause, Andreas
    Neu, Gergely
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [46] Enhancing Nash Q-learning and Team Q-learning mechanisms by using bottlenecks
    Ghazanfari, Behzad
    Mozayani, Nasser
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2014, 26 (06) : 2771 - 2783
  • [47] Virtual markets: Q-learning sellers with simple state representation
    Akchurina, Natalia
    Buening, Hans Kleine
    AUTONOMOUS INTELLIGENT SYSTEMS: AGENTS AND DATA MINING, PROCEEDINGS, 2007, 4476 : 192 - +
  • [48] Q-Learning Acceleration via State-space Partitioning
    Wei, Haoran
    Corder, Kevin
    Decker, Keith
    2018 17TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2018, : 293 - 298
  • [49] Reduction of the dynamic state-space in Fuzzy Q-Learning
    Kovács, S
    Baranyi, N
    2004 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-3, PROCEEDINGS, 2004, : 1075 - 1080
  • [50] State Distribution-Aware Sampling for Deep Q-Learning
    Weichao Li
    Fuxian Huang
    Xi Li
    Gang Pan
    Fei Wu
    Neural Processing Letters, 2019, 50 : 1649 - 1660