SOFT ACTOR-CRITIC ALGORITHM WITH ADAPTIVE NORMALIZATION

Times Cited: 0
Authors:
Gao, Xiaonan [1 ]
Wu, Ziyi [1 ]
Zhu, Xianchao [1 ]
Cai, Lei [2 ]
Affiliations:
[1] Henan Univ Technol, Sch Artificial Intelligence & Big Data, Zhengzhou 450001, Peoples R China
[2] Henan Inst Sci & Technol, Sch Artificial Intelligence, Xinxiang 453003, Peoples R China
Source:
JOURNAL OF NONLINEAR FUNCTIONAL ANALYSIS
Funding:
National Natural Science Foundation of China;
Keywords:
Adaptive normalization; Deep reinforcement learning; Reward mechanism; Soft actor-critic algorithm; GAME; GO;
DOI:
10.23952/jnfa.2025.6
CLC Number:
O29 [Applied Mathematics];
Discipline Code:
070104;
Abstract:
In recent years, breakthroughs have been made in deep reinforcement learning, but real-world applications remain seriously limited by the instability of the algorithms and the difficulty of guaranteeing convergence. The Soft Actor-Critic (SAC) algorithm, a representative reinforcement learning method, improves robustness and the agent's exploration ability by introducing the maximum-entropy principle, yet it still suffers from instability during training. To address this problem, this paper proposes an Adaptive Normalization-based SAC (AN-SAC) algorithm. By introducing an adaptively normalized reward mechanism into SAC, the method dynamically adjusts the normalization parameters of the reward during training so that the reward signal has zero mean and unit variance. The algorithm thus adapts better to the reward distribution, which improves its performance and stability. Experimental results demonstrate that AN-SAC achieves significantly better performance and stability than SAC.
Pages: 10
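
The abstract above does not give the paper's exact update rule for the adaptive normalization, but a standard way to keep a reward signal at approximately zero mean and unit variance is an online (Welford-style) running estimate of the reward statistics. The Python sketch below illustrates this idea; the class name RunningRewardNormalizer, the eps term, and the placement of the normalizer before replay-buffer storage are illustrative assumptions, not the authors' implementation.

import math

class RunningRewardNormalizer:
    """Tracks a running mean/variance of rewards and rescales them to
    approximately zero mean and unit variance (a sketch, not AN-SAC itself)."""

    def __init__(self, eps: float = 1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0   # sum of squared deviations from the current mean
        self.eps = eps  # guards against division by zero early in training

    def update(self, reward: float) -> None:
        # Welford's online update: a numerically stable single-pass estimate.
        self.count += 1
        delta = reward - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (reward - self.mean)

    def normalize(self, reward: float) -> float:
        var = self.m2 / max(self.count - 1, 1)
        return (reward - self.mean) / math.sqrt(var + self.eps)

# Usage in a SAC-style training loop: update the statistics with each raw
# environment reward, then store the normalized reward in the replay buffer.
normalizer = RunningRewardNormalizer()
for raw_reward in [1.0, 5.0, -2.0, 3.0]:  # placeholder environment rewards
    normalizer.update(raw_reward)
    scaled = normalizer.normalize(raw_reward)

Because the statistics are updated online, the normalization parameters track shifts in the reward distribution during training, which is the adaptivity the abstract describes.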