Highly efficient training method of MiniGo on a large-scale heterogeneous computing platform

Cited: 0
Authors
Li, Rongchun [1 ]
He, Zhouyu [1 ]
Qiao, Peng [1 ]
Jiang, Jingfei [1 ]
Dou, Yong [1 ]
Li, Dongsheng [1 ]
Affiliation
[1] National Key Laboratory of Parallel and Distributed Computing, National University of Defense Technology, Changsha 410073, China
Abstract
An efficient multi-level parallel training method for MiniGo agents on large-scale heterogeneous computing platforms was proposed, comprising task-level parallelism between nodes, CPU-DSP (central processing unit / digital signal processor) heterogeneous parallelism, and DSP core-level parallelism. Efficient input/output deployment was realized and the network communication bottleneck was eliminated. A heterogeneous memory management scheme oriented to the CPU-DSP shared memory structure was proposed to reduce data transfers between heterogeneous devices. Shared-memory programming optimization was realized, and the dense convolution operator was accelerated on the DSP. Results show that, compared with 16-core CPU computation, the maximum speedup of the single-core DSP operator acceleration is 16.44. With this method, the number of computing nodes was scaled from 1067 to 4139, the time required to reach the given termination condition was reduced from 43.02 h to 16.05 h, and the scaling efficiency is 69.1%. Evaluation shows that this method achieves efficient parallel training of MiniGo on large-scale heterogeneous computing platforms. © 2024 National University of Defense Technology. All rights reserved.
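The reported 69.1% scaling efficiency is consistent with the usual definition of scaling efficiency as achieved speedup divided by the increase in node count (an assumption; the paper's exact definition is not quoted in this record). A worked check from the figures above:

\[
E \;=\; \frac{T_{1067}/T_{4139}}{4139/1067}
  \;=\; \frac{43.02/16.05}{4139/1067}
  \;\approx\; \frac{2.680}{3.879}
  \;\approx\; 0.691 \;=\; 69.1\%.
\]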
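To make the shared-memory idea concrete, the following is a minimal C sketch of the zero-copy pattern the abstract describes: CPU and accelerator operate on one mapped region instead of copying tensors between device memories. The platform's actual CPU-DSP runtime is not shown; POSIX shared memory and the buffer name "/minigo_shared_buf" are stand-ins (assumptions), not the authors' API.

/* Sketch: one shared buffer visible to both sides, so no
 * host<->device memcpy is needed (POSIX shm as a stand-in
 * for the platform's CPU-DSP shared-memory allocator). */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define BUF_BYTES (1 << 20)  /* 1 MiB tensor buffer */

int main(void) {
    /* Create a named shared region; on the real platform this would
     * be memory visible to both the CPU cores and the DSP clusters. */
    int fd = shm_open("/minigo_shared_buf", O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, BUF_BYTES) != 0) { perror("ftruncate"); return 1; }

    float *buf = mmap(NULL, BUF_BYTES, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    /* CPU side writes the input feature map in place ... */
    for (size_t i = 0; i < BUF_BYTES / sizeof(float); ++i)
        buf[i] = 0.0f;

    /* ... and a DSP convolution kernel launched on the same physical
     * pages would read and write it directly, with no staging copy. */

    munmap(buf, BUF_BYTES);
    shm_unlink("/minigo_shared_buf");
    close(fd);
    return 0;
}

(Compile with -lrt on older glibc systems.)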
DOI
10.11887/j.cn.202405022
Pages: 209-218