An improved Q-learning algorithm based on exploration region expansion strategy

Cited by: 0
Authors
Gao, Qingji [1]
Hong, Bingong [1]
He, Zhendong [2]
Liu, Jie [2]
Niu, Guochen [2]
Affiliations
[1] Harbin Inst Technol, Dept Comp Sci & Technol, Harbin 150001, Peoples R China
[2] Civil Aviat Univ China, Inst Res Robot, Tianjin 300300, Peoples R China
Keywords
Q-learning; exploration region expansion; exploration-exploitation; Metropolis criterion
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
To address one of the key problems in Q-learning, namely keeping the balance between exploration and exploitation, an improved Q-learning algorithm based on an exploration region expansion strategy is proposed, building on Metropolis criterion-based Q-learning. With this strategy, blind exploration of the entire environment is eliminated and learning efficiency is increased. In addition, another feasible path is sought whenever the agent encounters an obstacle, which makes the algorithm easy to implement on a real robot. An automatic termination condition is also put forward, so that redundant learning after the optimal path has been found is avoided and the learning time is reduced. The validity of the algorithm is verified by simulation experiments.
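The abstract names Metropolis criterion-based Q-learning as the starting point but does not spell out its rule. The sketch below shows one common formulation of that baseline, offered as an illustration under stated assumptions and not as the authors' implementation: a randomly drawn action replaces the greedy one with probability exp((Q(s, a_random) - Q(s, a_greedy)) / T). The function names, the list-of-lists Q-table, and the default values of `alpha` and `gamma` are all hypothetical. The region expansion strategy and the automatic termination condition are not specified in the abstract and are therefore not modelled.

```python
import math
import random


def metropolis_action(q_row, temperature, rng=random):
    """Pick an action index from one Q-table row via the Metropolis criterion.

    Assumed formulation: a random candidate action replaces the greedy action
    with probability exp((Q[candidate] - Q[greedy]) / temperature).
    """
    greedy = max(range(len(q_row)), key=lambda a: q_row[a])
    candidate = rng.randrange(len(q_row))
    # The exponent is never positive because the greedy action has the
    # largest Q-value, so the acceptance probability lies in [0, 1].
    accept_prob = math.exp((q_row[candidate] - q_row[greedy]) / temperature)
    return candidate if rng.random() < accept_prob else greedy


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update, applied to the table in place."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
```

With a high temperature the acceptance probability is close to 1 and the choice is nearly uniform (exploration); as the temperature is lowered, the greedy action dominates (exploitation). How the temperature is scheduled is left open here.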
Pages: 4167 / +
Page count: 2