Multi-AUV Pursuit-Evasion Game in the Internet of Underwater Things: An Efficient Training Framework via Offline Reinforcement Learning

Cited by: 2
Authors
Xu, Jingzehua [1]
Zhang, Zekai [1]
Wang, Jingjing [2, 3]
Han, Zhu [4, 5]
Ren, Yong [6]
Affiliations
[1] Tsinghua Univ, Tsinghua Shenzhen Int Grad Sch, Shenzhen 518055, Peoples R China
[2] Beihang Univ, Sch Cyber Sci & Technol, Beijing 100191, Peoples R China
[3] Beihang Univ, Hangzhou Innovat Inst, Hangzhou 310051, Peoples R China
[4] Univ Houston, Dept Elect & Comp Engn, Houston, TX 77004 USA
[5] Kyung Hee Univ, Dept Comp Sci & Engn, Seoul 446701, South Korea
[6] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
Source
IEEE INTERNET OF THINGS JOURNAL | 2024 / Vol. 11 / Issue 19
Funding
Japan Science and Technology Agency; National Natural Science Foundation of China;
Keywords
Games; Training; Target tracking; Sensors; Task analysis; Internet of Things; Transformers; Autonomous underwater vehicle (AUV); decision transformer (DT); finite-horizon Markov game process (FMGP); offline reinforcement learning (ORL); pursuit-evasion game;
DOI
10.1109/JIOT.2024.3416616
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this article, we investigate the pursuit-evasion game of multiple autonomous underwater vehicles (AUVs) in a complex ocean environment. The pursuer AUVs must optimize their trajectories to avoid obstacles and dangerous vortex regions while pursuing the escaper AUV. Both the pursuers and the escaper sense each other with limited detection capabilities for further pursuit or escape. As the underwater pursuit-evasion (UPE) game is a high-dimensional NP-hard problem, we transform it into a finite-horizon Markov game process and propose an efficient training framework with decentralized training and decentralized execution, based on offline reinforcement learning. During training, we propose multiagent independent soft actor-critic to facilitate policy improvement and generate the offline data set, and multiagent independent decision transformer for model training in the UPE game. Extensive simulations demonstrate the scalability and generalization ability of the proposed framework, which achieves excellent performance in UPE games under different conditions and environments with only a few AUVs participating in policy improvement to generate the high-quality offline data set.
Pages: 31273 - 31286
Number of pages: 14
Related papers
48 in total
  • [11] Cooperative control for multi-player pursuit-evasion games with reinforcement learning
    Wang, Yuanda
    Dong, Lu
    Sun, Changyin
    NEUROCOMPUTING, 2020, 412 : 101 - 114
  • [12] Pursuit-Evasion Games for Multi-agent Based on Reinforcement Learning with Obstacles
    Hu, Penglin
    Guo, Yaning
    Hu, Jinwen
    Pan, Quan
    PROCEEDINGS OF 2022 INTERNATIONAL CONFERENCE ON AUTONOMOUS UNMANNED SYSTEMS, ICAUS 2022, 2023, 1010 : 1015 - 1024
  • [13] Collaborative Multi-AUV Optical Communication via Deep Reinforcement Learning
    Li, Mengzhen
    Luo, Hanjiang
    Tao, Hang
    Li, Xiang
    Dong, Peijun
    Wu, Kaishun
    IEEE SENSORS JOURNAL, 2025, 25 (01) : 1627 - 1640
  • [14] Using Cognitive Behavioral Learning in Multi-Agent Pursuit-Evasion Game
    Kuo, Jong Yih
    Liu, Chien-Hung
    Lee, Fang-Wen
    ASIA MODELLING SYMPOSIUM 2014 (AMS 2014), 2014, : 16 - 20
  • [15] An Efficient Multi-AUV Cooperative Navigation Method Based on Hierarchical Reinforcement Learning
    Zhu, Zixiao
    Zhang, Lichuan
    Liu, Lu
    Wu, Dongwei
    Bai, Shuchang
    Ren, Ranzhen
    Geng, Wenlong
    JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2023, 11 (10)
  • [16] Terminal-guidance Based Reinforcement-learning for Orbital Pursuit-evasion Game of the Spacecraft
    Geng Y.-Z.
    Yuan L.
    Huang H.
    Tang L.
    Zidonghua Xuebao/Acta Automatica Sinica, 2023, 49 (05) : 974 - 984
  • [17] Integral reinforcement learning based dynamic stackelberg pursuit-evasion game for unmanned surface vehicles
    Hu, Xiaoxiang
    Liu, Shuaizheng
    Xu, Jingwen
    Xiao, Bing
    Guo, Chenguang
    ALEXANDRIA ENGINEERING JOURNAL, 2024, 108 : 428 - 435
  • [18] Apollonius partitions based pursuit-evasion strategies via multi-agent reinforcement
    Xue, Lei
    Wang, Qing
    Wu, Yongbao
    Yuan, Xin
    Liu, Jian
    NEUROCOMPUTING, 2025, 630
  • [19] Underwater Target Tracking Based on Interrupted Software-Defined Multi-AUV Reinforcement Learning: A Multi-AUV Time-Saving MARL Approach
    Zhu, Shengchao
    Han, Guangjie
    Lin, Chuan
    Zhang, Yu
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2025, 24 (03) : 2124 - 2136
  • [20] Game Theoretic Reinforcement Learning Framework For Industrial Internet of Things
    Tai Manh Ho
    Kim-Khoa Nguyen
    Cheriet, Mohamed
    2022 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC), 2022, : 2112 - 2117