Bayesian Soft Actor-Critic: A Directed Acyclic Strategy Graph Based Deep Reinforcement Learning

Authors
Yang, Qin [1]
Parasuraman, Ramviyas [2]
Affiliations
[1] Bradley Univ, Comp Sci & Informat Syst Dept, Peoria, IL 61625 USA
[2] Univ Georgia, Dept Comp Sci, Athens, GA 30602 USA
Keywords
Strategy; Bayesian Networks; Deep Reinforcement Learning; Soft Actor-Critic; Utility; Expectation;
DOI
10.1145/3605098.3636113
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Adopting reasonable strategies is challenging but crucial for an intelligent agent with limited resources working in hazardous, unstructured, and dynamic environments to improve the system's utility, decrease the overall cost, and increase mission success probability. This paper proposes a novel directed acyclic strategy graph decomposition approach based on Bayesian chaining to separate an intricate policy into several simple sub-policies and organize their relationships as Bayesian strategy networks (BSN). We integrate this approach into a state-of-the-art deep reinforcement learning (DRL) method, soft actor-critic (SAC), and build the corresponding Bayesian soft actor-critic (BSAC) model by organizing several sub-policies as a joint policy. We compare our method against state-of-the-art deep reinforcement learning algorithms on the standard continuous control benchmarks in the OpenAI Gym environment. The results demonstrate the promising potential of the BSAC method, which significantly improves training efficiency.
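
The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-node graph `PARENTS`, the linear `sub_policy_mean` stand-in for a sub-policy network, and the unit standard deviation are all assumptions chosen for illustration. It shows only the general idea that, when sub-policies are arranged in a directed acyclic graph, a joint action can be sampled in topological order and its log-probability is the sum of the conditional log-probabilities.

```python
import math
import random
from graphlib import TopologicalSorter

# Hypothetical strategy graph (assumption): each key is a sub-policy and each
# value is the set of parent sub-policies whose actions it conditions on.
PARENTS = {"a1": set(), "a2": {"a1"}, "a3": {"a1", "a2"}}


def gaussian_log_prob(x, mean, std):
    """Log-density of a univariate normal distribution."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std) - 0.5 * math.log(2 * math.pi)


def sub_policy_mean(state, parent_actions):
    """Toy stand-in for a sub-policy network: a fixed linear map (assumption)."""
    return 0.5 * state + 0.1 * sum(parent_actions)


def sample_joint_action(state, rng, std=1.0):
    """Sample each sub-action after its parents and sum the conditional log-probs."""
    actions, joint_log_prob = {}, 0.0
    for node in TopologicalSorter(PARENTS).static_order():
        parent_actions = [actions[p] for p in sorted(PARENTS[node])]
        mean = sub_policy_mean(state, parent_actions)
        actions[node] = rng.gauss(mean, std)
        joint_log_prob += gaussian_log_prob(actions[node], mean, std)
    return actions, joint_log_prob


if __name__ == "__main__":
    actions, log_prob = sample_joint_action(state=1.0, rng=random.Random(0))
    print(actions, log_prob)
```

In a SAC-style learner, a summed log-probability of this kind is what the entropy term would consume; how BSAC actually parameterizes and trains its sub-policies is described in the paper itself.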
Pages: 646-648
Page count: 3