Labeling Q-learning in hidden state environments

Cited by: 0
Authors
Hae-Yeon Lee
Hiroyuki Kamaya
Ken-ichi Abe
Affiliations
[1] Tohoku University, Department of Electric and Communication Engineering, Graduate School of Engineering
[2] Hachinohe National College of Technology, Department of Electrical Engineering
Keywords
Reinforcement learning; Labeling Q-learning; Hidden states environment; Agent; Grid-world; Partially observable Markov decision process (POMDP);
DOI
10.1007/BF02481264
Abstract
Recently, reinforcement learning (RL) methods have been applied to learning problems in environments with embedded hidden states. However, conventional RL methods are limited to Markov decision process problems. Several algorithms have been proposed to overcome hidden states, but they require an extreme amount of memory to store past sequences representing historical state transitions. The aim of this work is to extend our previously proposed algorithm for environments with hidden states, called labeling Q-learning (LQ-learning), which reinforces incompletely observed perception by labeling. In LQ-learning, the agent has a perception structure consisting of pairs of observations and labels. From these pairs, the agent can more precisely distinguish hidden states that look identical but are actually different from each other. Labeling is carried out by labeling functions. Numerous labeling functions can be considered; here we introduce labeling functions based only on the last and the current observations. The extended LQ-learning is applied to grid-world problems containing hidden states. The simulation results demonstrate the effectiveness of LQ-learning.
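
The abstract gives no implementation details, but the core idea, augmenting each raw observation with a label so that perceptually aliased states map to distinct table entries, can be sketched as follows. This is a minimal illustration only: the agent class, the environment interface, and in particular the labeling function `label_fn` (which, as in the abstract, depends only on the last and the current observations) are hypothetical choices, not the authors' implementation.

```python
# Minimal, illustrative sketch of labeling Q-learning (LQ-learning).
# All names (LQAgent, label_fn, the env interface) are assumptions made
# for illustration; they are not taken from the paper.
import random
from collections import defaultdict


def label_fn(prev_obs, prev_label, obs):
    # Hypothetical labeling function using only the last and current
    # observations: remember the previous observation when the current one
    # repeats it (a possible sign of perceptual aliasing), otherwise reset.
    # A real labeling function would be designed per task.
    return prev_obs if obs == prev_obs else 0


class LQAgent:
    """Tabular Q-learning over (observation, label) pairs instead of raw observations."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)   # Q[((obs, label), action)]
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        # epsilon-greedy action selection over the labeled state
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update, applied to labeled states
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


def run_episode(env, agent, max_steps=100):
    # `env` is assumed to expose reset() -> obs and step(a) -> (obs, reward, done).
    obs = env.reset()
    label = 0
    state = (obs, label)
    for _ in range(max_steps):
        action = agent.act(state)
        next_obs, reward, done = env.step(action)
        next_label = label_fn(obs, label, next_obs)
        next_state = (next_obs, next_label)
        agent.update(state, action, reward, next_state)
        obs, label, state = next_obs, next_label, next_state
        if done:
            break
```

The specific `label_fn` above is only a placeholder; the essential point is that the Q-table is indexed by (observation, label) pairs rather than by raw observations, so two aliased grid cells that yield the same observation can still receive different labels and hence different Q-values.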
Pages: 181 - 184
Number of pages: 3
Related papers (50 in total)
  • [31] Detection of Hidden Moving Targets by a Group of Mobile Agents with Deep Q-Learning
    Matzliach, Barouch
    Ben-Gal, Irad
    Kagan, Evgeny
    ROBOTICS, 2023, 12 (04)
  • [32] CVaR Q-Learning
    Stanko, Silvestr
    Macek, Karel
    COMPUTATIONAL INTELLIGENCE: 11th International Joint Conference, IJCCI 2019, Vienna, Austria, September 17-19, 2019, Revised Selected Papers, 2021, 922 : 333 - 358
  • [33] Bayesian Q-learning
    Dearden, R
    Friedman, N
    Russell, S
    FIFTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-98) AND TENTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE (IAAI-98) - PROCEEDINGS, 1998, : 761 - 768
  • [34] Zap Q-Learning
    Devraj, Adithya M.
    Meyn, Sean P.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [35] Convex Q-Learning
    Lu, Fan
    Mehta, Prashant G.
    Meyn, Sean P.
    Neu, Gergely
    2021 AMERICAN CONTROL CONFERENCE (ACC), 2021, : 4749 - 4756
  • [36] Experimental Research on Avoidance Obstacle Control for Mobile Robots Using Q-Learning (QL) and Deep Q-Learning (DQL) Algorithms in Dynamic Environments
    Ha, Vo Thanh
    Vinh, Vo Quang
    ACTUATORS, 2024, 13 (01)
  • [37] Fuzzy Q-learning
    Glorennec, PY
    Jouffe, L
    PROCEEDINGS OF THE SIXTH IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS I - III, 1997, : 659 - 662
  • [38] Q-learning and robotics
    Touzet, CF
    Santos, JM
    SIMULATION IN INDUSTRY 2001, 2001, : 685 - 689
  • [39] Periodic Q-Learning
    Lee, Donghwan
    He, Niao
    LEARNING FOR DYNAMICS AND CONTROL, VOL 120, 2020, 120 : 582 - 598
  • [40] Q-learning automaton
    Qian, F
    Hirata, H
    IEEE/WIC INTERNATIONAL CONFERENCE ON INTELLIGENT AGENT TECHNOLOGY, PROCEEDINGS, 2003, : 432 - 437