Bayesian Strategy Networks Based Soft Actor-Critic Learning

Cited by: 0
Authors
Yang, Qin [1]
Parasuraman, Ramviyas [2]
Affiliations
[1] Bradley Univ, Bradley Hall 195,1501 Bradley Ave, Peoria, IL 61625 USA
[2] Univ Georgia, 415 Boyd Res & Educ Ctr, Athens, GA 30602 USA
Keywords
Strategy; Bayesian networks; deep reinforcement learning; soft actor-critic; utility; expectation; REINFORCEMENT; LEVEL
DOI
10.1145/3643862
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A strategy refers to the rules by which an agent chooses among the available actions to achieve its goals. Adopting reasonable strategies is challenging but crucial for an intelligent agent with limited resources working in hazardous, unstructured, and dynamic environments, because it improves the system's utility, decreases the overall cost, and increases the probability of mission success. This article proposes a novel hierarchical strategy decomposition approach based on Bayesian chaining that separates an intricate policy into several simple sub-policies and organizes their relationships as Bayesian strategy networks (BSN). We integrate this approach into a state-of-the-art deep reinforcement learning (DRL) method, soft actor-critic (SAC), and build the corresponding Bayesian soft actor-critic (BSAC) model by organizing several sub-policies as a joint policy. Our method achieves state-of-the-art performance on the standard continuous control benchmarks in the OpenAI Gym environment. The results demonstrate the promising potential of the BSAC method to significantly improve training efficiency. Furthermore, we extend the topic to multi-agent systems (MAS), discussing potential research fields and directions.
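The idea of organizing sub-policies as a joint policy through Bayesian chaining can be illustrated with a minimal sketch. It is not the authors' implementation: the three-node network `PARENTS`, the toy linear function `sub_policy_mean`, and the fixed-variance Gaussian sub-policies are all assumptions made for illustration. The sketch only shows the chain-rule structure, in which each sub-action is sampled conditioned on its parent sub-actions and the joint log-probability is the sum of the conditional log-probabilities.

```python
import math
import random

# Hypothetical 3-node strategy network: each sub-action lists its parent sub-actions.
PARENTS = {"a1": [], "a2": ["a1"], "a3": ["a1", "a2"]}
ORDER = ["a1", "a2", "a3"]  # a topological order of the network


def gaussian_log_prob(x, mean, std):
    """Log-density of a univariate normal distribution."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std) - 0.5 * math.log(2 * math.pi)


def sub_policy_mean(state, parent_actions):
    """Toy stand-in for a learned sub-policy network (assumed, not from the paper)."""
    return 0.1 * state + 0.5 * sum(parent_actions)


def sample_joint_action(state, rng, std=0.2):
    """Sample each sub-action given its parents; return actions and joint log-prob."""
    actions, log_prob = {}, 0.0
    for name in ORDER:
        parent_actions = [actions[p] for p in PARENTS[name]]
        mean = sub_policy_mean(state, parent_actions)
        actions[name] = rng.gauss(mean, std)
        # Chain rule: log pi(a | s) = sum_i log pi_i(a_i | parents(a_i), s)
        log_prob += gaussian_log_prob(actions[name], mean, std)
    return actions, log_prob


actions, log_prob = sample_joint_action(state=1.0, rng=random.Random(0))
```

In an SAC-style learner, a joint log-probability of this kind is the quantity that would enter the entropy term, so the factorization changes how the policy is parameterized rather than the form of the objective.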
Pages: 24
Related papers
50 in total
  • [21] Meta Soft Actor-Critic Based Robust Sequential Power Control in Vehicular Networks
    Liu, Zhihua
    Guo, Chongtao
    Guo, Cheng
    Liu, Zhaoyang
    Wang, Xijun
    2023 IEEE 98TH VEHICULAR TECHNOLOGY CONFERENCE, VTC2023-FALL, 2023,
  • [22] Soft Actor-Critic Reinforcement Learning-Based Optimization for Analog Circuit Sizing
    Park, Sejin
    Choi, Youngchang
    Kang, Seokhyeong
    2023 20TH INTERNATIONAL SOC DESIGN CONFERENCE, ISOCC, 2023, : 47 - 48
  • [23] Soft Actor-Critic with Inhibitory Networks for Retraining UAV Controllers Faster
    Choi, Minkyu
    Filter, Max
    Alcedo, Kevin
    Walker, Thayne T.
    Rosenbluth, David
    Ide, Jaime S.
    2022 INTERNATIONAL CONFERENCE ON UNMANNED AIRCRAFT SYSTEMS (ICUAS), 2022, : 1561 - 1570
  • [24] Soft Actor-Critic for Navigation of Mobile Robots
    de Jesus, Junior Costa
    Kich, Victor Augusto
    Kolling, Alisson Henrique
    Grando, Ricardo Bedin
    Cuadros, Marco Antonio de Souza Leite
    Gamarra, Daniel Fernando Tello
    Journal of Intelligent and Robotic Systems: Theory and Applications, 2021, 102 (02):
  • [25] Taming chimeras in coupled oscillators using soft actor-critic based reinforcement learning
    Ding, Jianpeng
    Lei, Youming
    Small, Michael
    CHAOS, 2025, 35 (01)
  • [26] A deep residual reinforcement learning algorithm based on Soft Actor-Critic for autonomous navigation
    Wen, Shuhuan
    Shu, Yili
    Rad, Ahmad
    Wen, Zeteng
    Guo, Zhengzheng
    Gong, Simeng
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 259
  • [27] Simultaneous Control and Guidance of an AUV Based on Soft Actor-Critic
    Sola, Yoann
    Le Chenadec, Gilles
    Clement, Benoit
    SENSORS, 2022, 22 (16)
  • [28] Soft Actor-Critic for Navigation of Mobile Robots
    Junior Costa de Jesus
    Victor Augusto Kich
    Alisson Henrique Kolling
    Ricardo Bedin Grando
    Marco Antonio de Souza Leite Cuadros
    Daniel Fernando Tello Gamarra
    Journal of Intelligent & Robotic Systems, 2021, 102
  • [29] Actor-critic learning based on fuzzy inference system
    Jouffe, L
    INFORMATION INTELLIGENCE AND SYSTEMS, VOLS 1-4, 1996, : 339 - 344
  • [30] Soft Actor-Critic for Navigation of Mobile Robots
    de Jesus, Junior Costa
    Kich, Victor Augusto
    Kolling, Alisson Henrique
    Grando, Ricardo Bedin
    Cuadros, Marco Antonio de Souza Leite
    Gamarra, Daniel Fernando Tello
    JOURNAL OF INTELLIGENT & ROBOTIC SYSTEMS, 2021, 102 (02)