Intelligent decision-making technology for wargame by integrating three-way multiple attribute decision-making and SAC

Cited by: 0

Authors:

Peng L. [1,2]

Sun Y. [1]

Xue Y. [1]

Zhou X. [1,3]

Affiliations:

[1] School of Engineering Management, Nanjing University, Nanjing

[2] School of Information Technology & Artificial Intelligence, Zhejiang University of Finance & Economics, Hangzhou

[3] Research Center for New Technology in Intelligent Equipment, Nanjing University, Nanjing

Source:

Xi Tong Gong Cheng Yu Dian Zi Ji Shu/Systems Engineering and Electronics | 2024, Vol. 46, No. 07

Keywords:

intelligent decision; reinforcement learning (RL); soft actor-critic (SAC); three-way multiple attribute decision making (TWMADM); wargame

DOI:

10.12305/j.issn.1001-506X.2024.07.15

Abstract:

In recent years, generating intelligent confrontation strategies for wargaming with deep reinforcement learning has attracted widespread attention. To address the low sampling rate and slow training convergence of reinforcement learning decision models, and the low game winning rate of the resulting agents, an intelligent decision-making technology that integrates three-way multiple attribute decision making (TWMADM) with reinforcement learning is proposed. A wargaming agent is developed on the basis of the classical soft actor-critic (SAC) algorithm. The TWMADM method is used to evaluate the threat posed by the opposing operator, and the threat assessment results are introduced into the SAC algorithm as prior knowledge to plan tactical decisions. A game confrontation experiment is conducted in a typical wargame system, and the results show that the proposed algorithm effectively speeds up training convergence and improves both the efficiency of generating adversarial strategies and the game winning rate of the agents. © 2024 Chinese Institute of Electronics. All rights reserved.

Citation

Pages: 2310-2322

Page count: 12

