SOFT ACTOR-CRITIC ALGORITHM WITH ADAPTIVE NORMALIZATION

Times Cited: 0
Authors:
Gao, Xiaonan [1 ]
Wu, Ziyi [1 ]
Zhu, Xianchao [1 ]
Cai, Lei [2 ]
Affiliations:
[1] Henan Univ Technol, Sch Artificial Intelligence & Big Data, Zhengzhou 450001, Peoples R China
[2] Henan Inst Sci & Technol, Sch Artificial Intelligence, Xinxiang 453003, Peoples R China
Source:
JOURNAL OF NONLINEAR FUNCTIONAL ANALYSIS
Funding:
National Natural Science Foundation of China;
Keywords:
Adaptive normalization; Deep reinforcement learning; Reward mechanism; Soft actor-critic algorithm; GAME; GO;
DOI:
10.23952/jnfa.2025.6
CLC Number:
O29 [Applied Mathematics];
Discipline Code:
070104;
Abstract:
In recent years, breakthroughs have been made in deep reinforcement learning, but real-world applications remain seriously limited by the instability of the algorithms and the difficulty of guaranteeing convergence. The Soft Actor-Critic (SAC) algorithm, a representative reinforcement learning method, improves robustness and the agent's exploration ability by introducing the maximum-entropy principle, yet it still suffers from instability during training. To address this problem, this paper proposes an Adaptive Normalization-based SAC (AN-SAC) algorithm. By introducing an adaptively normalized reward mechanism into SAC, the method dynamically adjusts the normalization parameters of the reward during training so that the reward signal has zero mean and unit variance. The algorithm thus adapts better to the reward distribution, which improves its performance and stability. Experimental results demonstrate that AN-SAC achieves significantly better performance and stability than SAC.
Pages: 10
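
The abstract above does not give the paper's exact update rule for the adaptive normalization, but a standard way to keep a reward signal at approximately zero mean and unit variance is an online (Welford-style) running estimate of the reward statistics. The Python sketch below illustrates this idea; the class name RunningRewardNormalizer, the eps term, and the placement of the normalizer before replay-buffer storage are illustrative assumptions, not the authors' implementation.

import math

class RunningRewardNormalizer:
    """Tracks a running mean/variance of rewards and rescales them to
    approximately zero mean and unit variance (a sketch, not AN-SAC itself)."""

    def __init__(self, eps: float = 1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0   # sum of squared deviations from the current mean
        self.eps = eps  # guards against division by zero early in training

    def update(self, reward: float) -> None:
        # Welford's online update: a numerically stable single-pass estimate.
        self.count += 1
        delta = reward - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (reward - self.mean)

    def normalize(self, reward: float) -> float:
        var = self.m2 / max(self.count - 1, 1)
        return (reward - self.mean) / math.sqrt(var + self.eps)

# Usage in a SAC-style training loop: update the statistics with each raw
# environment reward, then store the normalized reward in the replay buffer.
normalizer = RunningRewardNormalizer()
for raw_reward in [1.0, 5.0, -2.0, 3.0]:  # placeholder environment rewards
    normalizer.update(raw_reward)
    scaled = normalizer.normalize(raw_reward)

Because the statistics are updated online, the normalization parameters track shifts in the reward distribution during training, which is the adaptivity the abstract describes.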