Bayesian Soft Actor-Critic: A Directed Acyclic Strategy Graph Based Deep Reinforcement Learning

Cited by: 0

Authors:

Yang, Qin [1]

Parasuraman, Ramviyas [2]

Affiliations:

[1] Bradley Univ, Comp Sci & Informat Syst Dept, Peoria, IL 61625 USA

[2] Univ Georgia, Dept Comp Sci, Athens, GA 30602 USA

Source:

39TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2024 | 2024

Keywords:

Strategy; Bayesian Networks; Deep Reinforcement Learning; Soft Actor-Critic; Utility; Expectation;

DOI:

10.1145/3605098.3636113

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Subject classification codes:

081203 ; 0835 ;

Abstract:

Adopting reasonable strategies is challenging but crucial for an intelligent agent with limited resources working in hazardous, unstructured, and dynamic environments to improve the system's utility, decrease the overall cost, and increase mission success probability. This paper proposes a novel directed acyclic strategy graph decomposition approach based on Bayesian chaining to separate an intricate policy into several simple sub-policies and organize their relationships as Bayesian strategy networks (BSN). We integrate this approach into soft actor-critic (SAC), a state-of-the-art deep reinforcement learning (DRL) method, and build the corresponding Bayesian soft actor-critic (BSAC) model by organizing several sub-policies as a joint policy. We compare our method against state-of-the-art deep reinforcement learning algorithms on the standard continuous control benchmarks in the OpenAI Gym environment. The results demonstrate the promising potential of the BSAC method, which significantly improves training efficiency.
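Illustrative note: the abstract does not spell out how sub-policies are combined, so the following is only a minimal sketch of the general idea of Bayesian chaining over a directed acyclic graph, not the authors' implementation. It assumes a hypothetical three-component action (`a1`, `a2`, `a3`), a toy linear stand-in for each sub-policy network (`sub_policy_mean`), and Gaussian sub-policies with a fixed standard deviation. The only fact it relies on is the chain rule: the joint log-probability is the sum of each component's conditional log-probability given its parents.

```python
import math
import random

# Hypothetical strategy graph (an assumption): each action component lists its parents.
PARENTS = {"a1": [], "a2": ["a1"], "a3": ["a1", "a2"]}


def topological_order(parents):
    """Return nodes so that every node appears after all of its parents.

    Assumes the graph is acyclic; a cycle would recurse without end.
    """
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        for parent in parents[node]:
            visit(parent)
        seen.add(node)
        order.append(node)

    for node in parents:
        visit(node)
    return order


def sub_policy_mean(node, state, parent_actions):
    """Toy linear stand-in for a sub-policy network (purely illustrative)."""
    return 0.1 * state + 0.5 * sum(parent_actions)


def gaussian_log_prob(x, mean, std):
    """Log-density of a univariate Gaussian."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std) - 0.5 * math.log(2 * math.pi)


def sample_joint_action(state, parents, std=1.0, rng=random):
    """Sample each component given its parents and sum the conditional log-probs."""
    action, log_prob = {}, 0.0
    for node in topological_order(parents):
        mean = sub_policy_mean(node, state, [action[p] for p in parents[node]])
        action[node] = rng.gauss(mean, std)
        log_prob += gaussian_log_prob(action[node], mean, std)
    return action, log_prob
```

For example, `sample_joint_action(0.0, PARENTS, rng=random.Random(0))` returns a dictionary with one value per action component together with the joint log-probability, which is the kind of quantity an entropy-regularized actor-critic objective would consume.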


Pages: 646-648

Page count: 3
