An improved Q-learning algorithm based on exploration region expansion strategy

Cited by: 0

Authors:

Gao, Qingji [1]

Hong, Bingong [1]

He, Zhendong [2]

Liu, Jie [2]

Niu, Guochen [2]

Affiliations:

[1] Harbin Inst Technol, Dept Comp Sci & Technol, Harbin 150001, Peoples R China

[2] Civil Aviat Univ China, Inst Res Robot, Tianjin 300300, Peoples R China

Source:

WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS | 2006

Keywords:

Q-learning; exploration region expansion; exploration-exploitation; Metropolis criterion

DOI:

Not available

Chinese Library Classification:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

To address one of the key problems in Q-learning, keeping the balance between exploration and exploitation, an improved Q-learning algorithm based on an exploration region expansion strategy is proposed, building on Metropolis criterion-based Q-learning. With this strategy, blind exploration of the entire environment is eliminated and learning efficiency is increased. In addition, where the agent encounters obstacles, another feasible path is sought, which makes the algorithm easy to implement on a real robot. An automatic termination condition is also put forward, so that redundant learning after the optimal path has been found is avoided and learning time is reduced. The validity of the algorithm is demonstrated by simulation experiments.
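
For orientation, the Metropolis criterion-based Q-learning that the abstract names as its starting point can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the exploration region expansion strategy and the automatic termination condition are not specified in this record and are not shown. The function names, the dict-of-lists Q-table, and the default `alpha` and `gamma` values are all assumptions made for the example.

```python
import math
import random


def metropolis_action(q_row, temperature, rng=random):
    """Pick an action index from one Q-table row using a Metropolis-style test.

    Assumption: a uniformly random candidate action replaces the greedy one
    with probability exp((Q(s, candidate) - Q(s, greedy)) / T).
    `temperature` must be positive.
    """
    greedy = max(range(len(q_row)), key=lambda a: q_row[a])
    candidate = rng.randrange(len(q_row))
    # delta is never positive, so the acceptance probability lies in [0, 1].
    delta = q_row[candidate] - q_row[greedy]
    accept_prob = math.exp(delta / temperature)
    return candidate if rng.random() < accept_prob else greedy


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update on a dict-of-lists Q-table."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
```

With a high temperature the random candidate is almost always accepted (exploration); as the temperature is lowered toward zero, the selection approaches the greedy action (exploitation). How the temperature is scheduled is left open here.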

Pages: 4167 / +

Page count: 2
