Multi-AUV Pursuit-Evasion Game in the Internet of Underwater Things: An Efficient Training Framework via Offline Reinforcement Learning

Cited by: 2
Authors
Xu, Jingzehua [1]
Zhang, Zekai [1]
Wang, Jingjing [2, 3]
Han, Zhu [4, 5]
Ren, Yong [6]
Affiliations
[1] Tsinghua Univ, Tsinghua Shenzhen Int Grad Sch, Shenzhen 518055, Peoples R China
[2] Beihang Univ, Sch Cyber Sci & Technol, Beijing 100191, Peoples R China
[3] Beihang Univ, Hangzhou Innovat Inst, Hangzhou 310051, Peoples R China
[4] Univ Houston, Dept Elect & Comp Engn, Houston, TX 77004 USA
[5] Kyung Hee Univ, Dept Comp Sci & Engn, Seoul 446701, South Korea
[6] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
Source
IEEE INTERNET OF THINGS JOURNAL | 2024 / Vol. 11 / No. 19
Funding
Japan Science and Technology Agency; National Natural Science Foundation of China;
Keywords
Games; Training; Target tracking; Sensors; Task analysis; Internet of Things; Transformers; Autonomous underwater vehicle (AUV); decision transformer (DT); finite-horizon Markov game process (FMGP); offline reinforcement learning (ORL); pursuit-evasion game;
DOI
10.1109/JIOT.2024.3416616
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this article, we investigate the pursuit-evasion game of multiple autonomous underwater vehicles (AUVs) in a complex ocean environment. The pursuer AUVs need to optimize their trajectories to avoid obstacles and dangerous vortex regions in the environment in order to pursue the escaper AUV. Both the pursuers and the escaper can sense each other with limited detection capabilities for further pursuit or escape. As the underwater pursuit-evasion (UPE) game is a high-dimensional NP-hard problem, we innovatively transform it into a finite-horizon Markov game process and propose an efficient training framework with decentralized training and decentralized execution, based on offline reinforcement learning. During the training process, we propose a multiagent independent soft actor-critic to facilitate policy improvement and generate the offline data set, and a multiagent independent decision transformer for model training in the UPE game. Extensive simulations demonstrate the scalability and generalization ability of our proposed training framework, which can achieve excellent performance in UPE games under different conditions and environments with only a few AUVs participating in policy improvement to generate the high-quality offline data set.
Pages: 31273 - 31286
Page count: 14
Related papers
48 in total
  • [21] Decentralized optimal large scale multi-player pursuit-evasion strategies: A mean field game approach with reinforcement learning
    Zhou, Zejian
    Xu, Hao
    NEUROCOMPUTING, 2022, 484 : 46 - 58
  • [22] Reinforcement learning-based decision-making for spacecraft pursuit-evasion game in elliptical orbits
    Yu, Weizhuo
    Liu, Chuang
    Yue, Xiaokui
    CONTROL ENGINEERING PRACTICE, 2024, 153
  • [23] Game of Drones: UAV Pursuit-Evasion Game With Type-2 Fuzzy Logic Controllers Tuned by Reinforcement Learning
    Camci, Efe
    Kayacan, Erdal
    2016 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2016, : 618 - 625
  • [24] Optimal game theoretic solution of the pursuit-evasion intercept problem using on-policy reinforcement learning
    Kartal, Yusuf
    Subbarao, Kamesh
    Dogan, Atilla
    Lewis, Frank
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2021, 31 (16) : 7886 - 7903
  • [25] An Improved Approach towards Multi-Agent Pursuit-Evasion Game Decision-Making Using Deep Reinforcement Learning
    Wan, Kaifang
    Wu, Dingwei
    Zhai, Yiwei
    Li, Bo
    Gao, Xiaoguang
    Hu, Zijian
    ENTROPY, 2021, 23 (11)
  • [26] Underwater Target Tracking Based on Hierarchical Software-Defined Multi-AUV Reinforcement Learning: A Multi-AUV Advantage-Attention Actor-Critic Approach
    Zhu, Shengchao
    Han, Guangjie
    Lin, Chuan
    Tao, Qiuzi
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (12) : 13639 - 13653
  • [27] AUV-Aided Localization for Internet of Underwater Things: A Reinforcement-Learning-Based Method
    Yan, Jing
    Gong, Yadi
    Chen, Cailian
    Luo, Xiaoyuan
    Guan, Xinping
    IEEE INTERNET OF THINGS JOURNAL, 2020, 7 (10) : 9728 - 9746
  • [28] Reinforcement Learning based Anti-UAV Three-dimensional Pursuit-evasion Game for Substation Security
    Dong, Qingxue
    2024 5TH INTERNATIONAL CONFERENCE ON MECHATRONICS TECHNOLOGY AND INTELLIGENT MANUFACTURING, ICMTIM 2024, 2024, : 224 - 227
  • [30] Strategy solution of non-cooperative target pursuit-evasion game based on branching deep reinforcement learning
    Liu B.
    Ye X.
    Gao Y.
    Wang X.
    Ni L.
    Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica, 2020, 41 (10):