Bayesian Strategy Networks Based Soft Actor-Critic Learning

Cited: 0
Authors
Yang, Qin [1 ]
Parasuraman, Ramviyas [2 ]
Affiliations
[1] Bradley Univ, Bradley Hall 195,1501 Bradley Ave, Peoria, IL 61625 USA
[2] Univ Georgia, 415 Boyd Res & Educ Ctr, Athens, GA 30602 USA
Keywords
Strategy; Bayesian networks; deep reinforcement learning; soft actor-critic; utility; expectation; reinforcement; level
DOI
10.1145/3643862
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A strategy refers to the rules by which an agent chooses among available actions to achieve its goals. Adopting reasonable strategies is challenging but crucial for an intelligent agent with limited resources working in hazardous, unstructured, and dynamic environments to improve the system's utility, decrease the overall cost, and increase the probability of mission success. This article proposes a novel hierarchical strategy decomposition approach based on Bayesian chaining that separates an intricate policy into several simple sub-policies and organizes their relationships as Bayesian strategy networks (BSN). We integrate this approach into the state-of-the-art deep reinforcement learning (DRL) method, soft actor-critic (SAC), and build the corresponding Bayesian soft actor-critic (BSAC) model by organizing several sub-policies as a joint policy. Our method achieves state-of-the-art performance on the standard continuous control benchmarks in the OpenAI Gym environment. The results demonstrate the promising potential of the BSAC method and show that it significantly improves training efficiency. Furthermore, we extend the topic to multi-agent systems (MAS), discussing potential research fields and directions.
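The decomposition the abstract describes — factoring one intricate policy into chained sub-policies whose joint probability is the product along a Bayesian strategy network — can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the `SubPolicy` class, the linear-Gaussian sub-policies, and the hypothetical `joint_action` traversal are stand-ins, not the paper's actual architecture; in BSAC each sub-policy would be a neural network trained with the SAC objective.

```python
import numpy as np

rng = np.random.default_rng(0)

class SubPolicy:
    """Toy Gaussian sub-policy producing one 1-D sub-action.

    A fixed linear-Gaussian map stands in for a learned SAC actor network.
    """
    def __init__(self, in_dim):
        self.w = rng.normal(size=in_dim)  # stand-in for learned weights
        self.log_std = -1.0

    def sample(self, x):
        mean = float(self.w @ x)
        std = np.exp(self.log_std)
        a = rng.normal(mean, std)
        # Gaussian log-density of the sampled sub-action
        logp = -0.5 * ((a - mean) / std) ** 2 - np.log(std) - 0.5 * np.log(2 * np.pi)
        return a, logp

def joint_action(state, policies, parents):
    """Sample a joint action by traversing the strategy graph in topological order.

    parents[i] lists which earlier sub-actions sub-policy i conditions on;
    the joint log-probability is the sum over the chain, mirroring the
    Bayesian-chaining factorization described in the abstract.
    """
    actions, total_logp = [], 0.0
    for i, pi in enumerate(policies):
        x = np.concatenate([state, [actions[j] for j in parents[i]]])
        a, logp = pi.sample(x)
        actions.append(a)
        total_logp += logp
    return np.array(actions), total_logp

state = np.array([0.5, -0.2])
parents = [[], [0], [0, 1]]  # chain: a0 -> a1 -> a2
policies = [SubPolicy(state.size + len(p)) for p in parents]
acts, logp = joint_action(state, policies, parents)
print(acts.shape)  # three sub-actions forming one joint action
```

The design point is that each sub-policy stays low-dimensional and simple, while the product of the conditionals still represents a rich joint policy over the full action space.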
Pages: 24