A GRADIENT DESCENT SARSA(λ) ALGORITHM BASED ON THE ADAPTIVE REWARD-SHAPING MECHANISM

Cited by: 0

Authors:

Liu, Quan [1]

Fu, QiMing [1]

Xiao, Fei [1]

Fu, YuChen [1]

Affiliation:

[1] Soochow Univ, Dept Comp Sci & Technol, Suzhou, Peoples R China

Source:

INTELLIGENT AUTOMATION AND SOFT COMPUTING | 2013 / Vol. 19 / No. 04

Keywords:

reinforcement learning; Sarsa (lambda); gradient descent; reward-shaping; adaptive

DOI:

10.1080/10798587.2013.869119

Chinese Library Classification number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

To address poor initial performance and slow convergence in reinforcement learning tasks with continuous state spaces, we propose a novel gradient descent (GD) Sarsa(lambda) algorithm based on an adaptive reward-shaping mechanism. An adaptive normalized radial basis function (ANRBF) network is used to shape the reward. The reward-shaping mechanism propagates model knowledge to the learner in the form of an additional reward signal, so that both initial performance and convergence speed can be improved effectively. Building on the ANRBF network, a function approximation algorithm named ANRBF-GD-Sarsa(lambda) is proposed, and its convergence is analyzed theoretically. Experiments are conducted to show the good initial performance and fast convergence of the proposed algorithm.
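To make the ingredients named in the abstract concrete, here is a minimal sketch of linear gradient-descent Sarsa(lambda) over normalized RBF features, with an extra shaping term added to the reward. It is not the ANRBF-GD-Sarsa(lambda) algorithm itself: the abstract does not specify how the network adapts or what form the shaping signal takes, so the sketch assumes fixed RBF centers and a potential-style shaping term `gamma * potential(s') - potential(s)`. The toy one-dimensional task and every name below (`nrbf_features`, `shaped_reward`, `run_episode`, `potential`) are hypothetical and chosen only for illustration.

```python
import math
import random

def nrbf_features(s, centers, width):
    """Normalized Gaussian RBF activations of a scalar state (they sum to 1)."""
    acts = [math.exp(-((s - c) ** 2) / (2.0 * width ** 2)) for c in centers]
    total = sum(acts)
    return [a / total for a in acts]

def q_value(theta, phi, a):
    """Linear action value: dot product of action a's weights with the features."""
    return sum(t * f for t, f in zip(theta[a], phi))

def shaped_reward(r, s, s_next, potential, gamma):
    """Assumed shaping form: environment reward plus gamma*Phi(s') - Phi(s)."""
    return r + gamma * potential(s_next) - potential(s)

def epsilon_greedy(theta, phi, epsilon, rng):
    """Random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(theta))
    qs = [q_value(theta, phi, a) for a in range(len(theta))]
    return qs.index(max(qs))

def run_episode(theta, centers, width, potential, rng,
                alpha=0.05, gamma=0.95, lam=0.8, epsilon=0.1, max_steps=200):
    """One episode on a toy corridor (hypothetical task): positions 0..10,
    start at 1, goal at 10, action 0 = left, action 1 = right, reward -1 per step.
    Updates theta in place and returns the number of steps taken."""
    traces = [[0.0] * len(centers) for _ in theta]
    pos = 1
    phi = nrbf_features(pos / 10.0, centers, width)
    a = epsilon_greedy(theta, phi, epsilon, rng)
    for step in range(1, max_steps + 1):
        next_pos = min(10, max(0, pos + (1 if a == 1 else -1)))
        done = next_pos == 10
        r = shaped_reward(-1.0, pos / 10.0, next_pos / 10.0, potential, gamma)
        delta = r - q_value(theta, phi, a)
        # Decay all eligibility traces, then accumulate for the action taken.
        for b in range(len(theta)):
            for i in range(len(phi)):
                traces[b][i] *= gamma * lam
        for i in range(len(phi)):
            traces[a][i] += phi[i]
        if not done:
            phi_next = nrbf_features(next_pos / 10.0, centers, width)
            a_next = epsilon_greedy(theta, phi_next, epsilon, rng)
            delta += gamma * q_value(theta, phi_next, a_next)
        # Gradient-descent step along the traces, scaled by the TD error.
        for b in range(len(theta)):
            for i in range(len(phi)):
                theta[b][i] += alpha * delta * traces[b][i]
        if done:
            return step
        pos, phi, a = next_pos, phi_next, a_next
    return max_steps
```

A possible way to exercise the sketch is to create `centers = [i / 10.0 for i in range(11)]`, a zero-initialized `theta` with one weight list per action, and call `run_episode(theta, centers, 0.1, lambda s: s, random.Random(0))` repeatedly. The potential `lambda s: s` simply rewards progress toward the goal and stands in for whatever model knowledge a shaping network would supply.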


Pages: 599-612

Page count: 14
