SOFT ACTOR-CRITIC ALGORITHM WITH ADAPTIVE NORMALIZATION

Cited by: 0
Authors
Gao, Xiaonan [1]
Wu, Ziyi [1]
Zhu, Xianchao [1]
Cai, Lei [2]
Affiliations
[1] Henan Univ Technol, Sch Artificial Intelligence & Big Data, Zhengzhou 450001, Peoples R China
[2] Henan Inst Sci & Technol, Sch Artificial Intelligence, Xinxiang 453003, Peoples R China
Source
Funding
National Natural Science Foundation of China
Keywords
Adaptive normalization; Deep reinforcement learning; Reward mechanism; Soft actor-critic algorithm; GAME; GO
DOI
10.23952/jnfa.2025.6
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
In recent years, deep reinforcement learning has seen major breakthroughs, but its real-world application has been seriously limited by the instability of the algorithms and the difficulty of ensuring convergence. The soft actor-critic (SAC) algorithm, a typical reinforcement learning algorithm, improves robustness and the agent's exploration ability by introducing the concept of maximum entropy, yet it still suffers from instability during training. To address these problems, this paper proposes an Adaptive Normalization-based SAC (AN-SAC) algorithm. By introducing an adaptive normalized reward mechanism into SAC, the method dynamically adjusts the normalization parameters of the reward during training so that the reward has zero mean and unit variance. It thereby adapts better to the reward distribution and improves the performance and stability of the algorithm. Experimental results demonstrate that AN-SAC significantly improves performance and stability compared with SAC.
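The abstract does not state how the normalization parameters are updated, so the following is only a minimal sketch of one common way to keep a reward stream near zero mean and unit variance: a running mean and variance maintained with Welford's online update. The class name `RunningRewardNormalizer`, the `eps` parameter, and the sample rewards are hypothetical and are not taken from the paper; the authors' AN-SAC mechanism may differ.

```python
import math


class RunningRewardNormalizer:
    """Hypothetical helper, not the paper's method: running reward statistics."""

    def __init__(self, eps: float = 1e-8):
        self.count = 0    # number of rewards seen so far
        self.mean = 0.0   # running mean of the rewards
        self.m2 = 0.0     # running sum of squared deviations from the mean
        self.eps = eps    # assumed small constant to avoid division by zero

    def update(self, reward: float) -> None:
        # Welford's online update: one pass, no stored reward history.
        self.count += 1
        delta = reward - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (reward - self.mean)

    @property
    def std(self) -> float:
        # Population standard deviation; fall back to 1.0 until two samples exist.
        if self.count < 2:
            return 1.0
        return math.sqrt(self.m2 / self.count)

    def normalize(self, reward: float) -> float:
        # Shift and scale the reward with the current running statistics.
        return (reward - self.mean) / (self.std + self.eps)


# Illustrative usage with made-up rewards.
normalizer = RunningRewardNormalizer()
raw_rewards = [10.0, 12.0, 8.0, 11.0, 9.0]
for r in raw_rewards:
    normalizer.update(r)
normalized = [normalizer.normalize(r) for r in raw_rewards]
```

In an SAC training loop, a normalizer of this kind would typically be updated with each environment reward, and the normalized value stored in the replay buffer or applied when sampling a batch; which of the two the paper uses is not stated in the abstract.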
Pages: 10