Labeling Q-learning in hidden state environments

Cited by: 0
Authors
Hae-Yeon Lee
Hiroyuki Kamaya
Ken-ichi Abe
Affiliations
[1] Tohoku University, Department of Electrical and Communication Engineering, Graduate School of Engineering
[2] Hachinohe National College of Technology, Department of Electrical Engineering
Keywords
Reinforcement learning; Labeling Q-learning; Hidden states environment; Agent; Grid-world; Partially observable Markov decision process (POMDP);
DOI
10.1007/BF02481264
Abstract
Recently, reinforcement learning (RL) methods have been applied to learning problems in environments with embedded hidden states. However, conventional RL methods are limited to Markov decision process problems. Several algorithms have been proposed to overcome hidden states, but they require an extreme amount of memory to store past sequences representing historical state transitions. The aim of this work is to extend our previously proposed algorithm for environments with hidden states, called labeling Q-learning (LQ-learning), which reinforces incompletely observed perceptions by labeling them. In LQ-learning, the agent has a perception structure consisting of pairs of observations and labels. From these pairs, the agent can more precisely distinguish hidden states that look the same but are actually different from each other. Labeling is carried out by labeling functions. Numerous labeling functions can be considered; here we introduce labeling functions based only on the last and the current observations. This extended LQ-learning is applied to grid-world problems containing hidden states. The simulation results demonstrate the effectiveness of LQ-learning.
Pages: 181 - 184
Page count: 3
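The following short Python sketch illustrates the labeling Q-learning idea described in the abstract above: Q-values are stored over (observation, label) pairs, and the label is produced by a labeling function that uses only the previous and the current observations. The specific labeling rule, the class name LQAgent, and the hyperparameter values are assumptions made for illustration only, not the paper's exact definitions.

# Minimal illustrative sketch of labeling Q-learning (LQ-learning).
# Assumed labeling rule: advance the label when the same observation recurs,
# reset it otherwise. This is a hypothetical example, not the paper's rule.
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1  # assumed hyperparameters

class LQAgent:
    def __init__(self, actions, max_label=3):
        self.actions = actions
        self.max_label = max_label
        # Q-table keyed by ((observation, label), action).
        self.q = defaultdict(float)
        self.prev_obs = None
        self.label = 0

    def reset(self):
        # Call at the start of each episode.
        self.prev_obs = None
        self.label = 0

    def _labeling_function(self, obs):
        # Labeling based only on the previous and current observations:
        # a repeated observation is treated as a possible hidden-state alias.
        if self.prev_obs is not None and obs == self.prev_obs:
            self.label = min(self.label + 1, self.max_label)
        else:
            self.label = 0
        self.prev_obs = obs
        return self.label

    def perceive(self, obs):
        # The agent's internal state is the (observation, label) pair.
        return (obs, self._labeling_function(obs))

    def act(self, state):
        # Epsilon-greedy action selection over labeled states.
        if random.random() < EPSILON:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning backup applied to labeled states.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + GAMMA * best_next
        self.q[(state, action)] += ALPHA * (td_target - self.q[(state, action)])

In this sketch the label space is bounded by max_label, so the Q-table grows with (observations x labels x actions) rather than with the length of the observation history, in contrast to the history-based methods mentioned in the abstract that store long past sequences.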
Related Papers
50 records in total
  • [21] Learning rates for Q-Learning
    Even-Dar, E
    Mansour, Y
    COMPUTATIONAL LEARNING THEORY, PROCEEDINGS, 2001, 2111 : 589 - 604
  • [22] Learning rates for Q-learning
    Even-Dar, E
    Mansour, Y
    JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 5 : 1 - 25
  • [23] Entropy and PCA Analysis for Environments Associated to Q-Learning for Path Finding
    Garcia-Quijada, Manuel
    Gorrostieta-Hurtado, Efren
    Emilio Vargas-Soto, Jose
    Toledano-Ayala, Manuel
    APPLICATIONS OF COMPUTATIONAL INTELLIGENCE, COLCACI 2019, 2019, 1096 : 209 - 222
  • [24] Entropy and PCA Analysis for Environments Associated to Q-Learning for Path Finding
    Garcia-Quijada, Manuel
    Gorrostieta-Hurtado, Efren
    Emilio Vargas-Soto, Jose
    Toledano-Ayala, Manuel
    2019 IEEE COLOMBIAN CONFERENCE ON APPLICATIONS IN COMPUTATIONAL INTELLIGENCE (COLCACI), 2019,
  • [25] Multi-scale Q-Learning of A Mobile Robot in Dynamic Environments
    Takase, Noriko
    Kubota, Naoyuki
    Baba, Norio
    6TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS, AND THE 13TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS, 2012, : 1248 - 1252
  • [26] Swarm Q-Learning With Knowledge Sharing Within Environments for Formation Control
    Tung Nguyen
    Hung Nguyen
    Debie, Essam
    Kasmarik, Kathryn
    Garratt, Matthew
    Abbass, Hussein
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,
  • [27] Non-Reciprocating Sharing Methods in Cooperative Q-Learning Environments
    Cunningham, Bryan
    Cao, Yong
    2012 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY (WI-IAT 2012), VOL 2, 2012, : 212 - 219
  • [28] Maximize Producer Rewards in Distributed Windmill Environments: A Q-Learning Approach
    Li, Bei
    Gangadhar, Siddharth
    Verma, Pramode
    Cheng, Samuel
    AIMS ENERGY, 2015, 3 (01) : 162 - 172
  • [29] Cooperative Deep Q-Learning Framework for Environments Providing Image Feedback
    Raghavan, Krishnan
    Narayanan, Vignesh
    Jagannathan, Sarangapani
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (07) : 9267 - 9276
  • [30] Contextual Q-Learning
    Pinto, Tiago
    Vale, Zita
    ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, 325 : 2927 - 2928