An improved Q-learning algorithm based on exploration region expansion strategy

Cited by: 0

Authors:

Gao, Qingji [1]

Hong, Bingong [1]

He, Zhendong [2]

Liu, Jie [2]

Niu, Guochen [2]

Affiliations:

[1] Harbin Inst Technol, Dept Comp Sci & Technol, Harbin 150001, Peoples R China

[2] Civil Aviat Univ China, Inst Res Robot, Tianjin 300300, Peoples R China

Source:

WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS | 2006

Keywords:

Q-learning; exploration region expansion; exploration-exploitation; Metropolis criterion

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

To address one of the key problems in Q-learning, keeping the balance between exploration and exploitation, an improved Q-learning algorithm based on an exploration region expansion strategy is proposed, building on Metropolis criterion-based Q-learning. With this strategy, blind exploration of the entire environment is eliminated and learning efficiency is increased. In addition, when the agent encounters an obstacle, another feasible path is sought, which makes the algorithm easy to implement on a real robot. An automatic termination condition is also put forward, so redundant learning after the optimal path has been found is avoided and learning time is reduced. The validity of the algorithm is demonstrated by simulation experiments.
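
The abstract names Metropolis criterion-based Q-learning as the starting point but gives no pseudocode. The following is only a minimal sketch of how a Metropolis-style action choice is commonly combined with the standard one-step Q-learning update; it is not the authors' algorithm. The function names, the `temperature` parameter, and the default `alpha` and `gamma` values are assumptions made for illustration, and the exploration region expansion strategy and the automatic termination condition are not shown because this record does not describe them.

```python
import math
import random


def metropolis_select(q_row, temperature, rng=random):
    """Pick an action index from one state's Q-values (hypothetical helper).

    A random candidate action replaces the greedy action with probability
    exp((Q[candidate] - Q[greedy]) / temperature); otherwise the greedy
    action is kept. Lower temperatures make the choice greedier.
    """
    greedy = max(range(len(q_row)), key=lambda a: q_row[a])
    candidate = rng.randrange(len(q_row))
    # Q[candidate] <= Q[greedy], so the exponent is <= 0 and the
    # acceptance probability lies in [0, 1].
    accept_prob = math.exp((q_row[candidate] - q_row[greedy]) / temperature)
    return candidate if rng.random() < accept_prob else greedy


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update on a list-of-lists Q-table.

    alpha (learning rate) and gamma (discount factor) are assumed values.
    """
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
```

In a sketch like this, the temperature would typically be lowered over episodes so that early learning explores and later learning exploits; the schedule itself is a design choice not specified in this record.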


Pages: 4167 / +

Number of pages: 2
