Stochastic Integrated Actor-Critic for Deep Reinforcement Learning

Cited by: 5
Authors
Zheng, Jiaohao [1]
Kurt, Mehmet Necip [2]
Wang, Xiaodong [2]
Affiliations
[1] Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China
[2] Columbia Univ, Dept Elect Engn, New York, NY 10027 USA
Keywords
Training; Task analysis; Complexity theory; Linear programming; Network architecture; Decoding; Tensors; Actor-critic; adaptive objective; deep reinforcement learning (RL); integrated network; mixed on-off policy exploration; sample complexity;
DOI
10.1109/TNNLS.2022.3212273
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a deep stochastic actor-critic algorithm with an integrated network architecture and fewer parameters. We stabilize the learning procedure by applying an adaptive objective to the critic's loss and by using a smaller learning rate for the parameters shared between the actor and the critic. Moreover, we propose a mixed on-off policy exploration strategy to speed up learning. Experiments illustrate that our algorithm reduces the sample complexity by 50%-93% compared with the state-of-the-art deep reinforcement learning (RL) algorithms, namely twin delayed deep deterministic policy gradient (TD3), soft actor-critic (SAC), proximal policy optimization (PPO), advantage actor-critic (A2C), and interpolated policy gradient (IPG), on the continuous control tasks LunarLander, BipedalWalker, BipedalWalkerHardCore, Ant, and Minitaur in the OpenAI Gym.
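One of the stabilization ideas named in the abstract, a smaller learning rate for the parameters shared by the actor and the critic, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the group names, the toy parameter and gradient values, the 10x learning-rate ratio, and the use of plain SGD are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: one plain SGD step in which the parameters shared by
# the actor and the critic use a smaller learning rate than the two heads.
# Names, values, and the 10x ratio are illustrative assumptions only.


def sgd_step(params, grads, learning_rates):
    """Return updated parameter groups after one plain SGD step.

    params, grads: dict mapping group name -> list of floats
    learning_rates: dict mapping group name -> float
    """
    updated = {}
    for group, values in params.items():
        lr = learning_rates[group]
        updated[group] = [p - lr * g for p, g in zip(values, grads[group])]
    return updated


# Three parameter groups of an integrated network: a shared trunk plus
# separate actor and critic heads (toy sizes).
params = {
    "shared": [1.0, -1.0],
    "actor_head": [0.5],
    "critic_head": [2.0],
}

# Toy gradients; in practice these would come from backpropagating the
# actor and critic losses through the network.
grads = {
    "shared": [1.0, 1.0],
    "actor_head": [1.0],
    "critic_head": [1.0],
}

# Assumed ratio: the shared parameters move 10x more slowly than the heads.
learning_rates = {"shared": 0.01, "actor_head": 0.1, "critic_head": 0.1}

new_params = sgd_step(params, grads, learning_rates)
```

With identical gradients, the shared group moves by 0.01 per step while each head moves by 0.1, so gradients from the actor and critic objectives change the shared representation more slowly than they change either head.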
Pages: 6654 - 6666
Page count: 13
Related papers
  • [1] Robust Reward-Free Actor-Critic for Cooperative Multiagent Reinforcement Learning
    Lin, Qifeng
    Ling, Qing
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (12) : 17318 - 17329
  • [3] Integrated Actor-Critic for Deep Reinforcement Learning
    Zheng, Jiaohao
    Kurt, Mehmet Necip
    Wang, Xiaodong
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2021, PT IV, 2021, 12894 : 505 - 518
  • [4] Graph Soft Actor-Critic Reinforcement Learning for Large-Scale Distributed Multirobot Coordination
    Hu, Yifan
    Fu, Junjie
    Wen, Guanghui
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (01) : 665 - 676
  • [5] Stochastic inversion of magnetotelluric data using deep reinforcement learning
    Wang, Han
    Liu, Yunhe
    Yin, Changchun
    Li, Jinfeng
    Su, Yang
    Xiong, Bin
    GEOPHYSICS, 2022, 87 (01) : E49 - E61
  • [6] Stochastic inversion of magnetotelluric data using deep reinforcement learning
    Wang H.
    Liu Y.
    Yin C.
    Li J.
    Su Y.
    Xiong B.
    Geophysics, 2021, 87 (01) : 1 - 52
  • [7] Integrated Guidance and Control for Missile Using Deep Reinforcement Learning
    Pei P.
    He S.-M.
    Wang J.
    Lin D.-F.
    Yuhang Xuebao/Journal of Astronautics, 2021, 42 (10) : 1293 - 1304
  • [8] Analog Integrated Circuit Topology Synthesis With Deep Reinforcement Learning
    Zhao, Zhenxin
    Zhang, Lihong
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41 (12) : 5138 - 5151
  • [9] Stochastic Reinforcement Learning
    Kuang, Nikki Lijing
    Leung, Clement H. C.
    Sung, Vienne W. K.
    2018 IEEE FIRST INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND KNOWLEDGE ENGINEERING (AIKE), 2018 : 244 - 248
  • [10] Deep Reinforcement Learning Approach for Integrated Updraft Mapping and Exploitation
    Notter, Stefan
    Gall, Christian
    Mueller, Gregor
    Ahmad, Aamir
    Fichter, Walter
    JOURNAL OF GUIDANCE CONTROL AND DYNAMICS, 2023, 46 (10) : 1997 - 2004