Efficient Use of GPU Memory for Large-Scale Deep Learning Model Training

Cited by: 13
Authors
Choi, Hyeonseong [1 ]
Lee, Jaehwan [1 ]
Affiliations
[1] Korea Aerospace University, School of Electronics and Information Engineering, Goyang-si 10540, South Korea
Source
APPLIED SCIENCES-BASEL | 2021, Vol. 11, Issue 21
Funding
National Research Foundation of Singapore;
Keywords
deep learning; large-scale model; CUDA Unified Memory; PyTorch;
DOI
10.3390/app112110377
CLC Number
O6 [Chemistry];
Discipline Code
0703;
Abstract
To achieve high accuracy in deep learning, it is necessary to use a large-scale training model. However, due to the limited capacity of GPU memory, it is difficult to train such models on a single GPU. NVIDIA introduced CUDA Unified Memory with CUDA 6 to overcome this limitation by virtually combining GPU memory and CPU memory into a single address space. In addition, CUDA 8 introduced memory advise options so that Unified Memory can be utilized more efficiently. In this work, we propose a newly optimized scheme based on CUDA Unified Memory that uses GPU memory efficiently by applying a different memory advise to each data type according to its access pattern during deep learning training. We apply CUDA Unified Memory technology to PyTorch to evaluate the performance of large-scale models on the expanded GPU memory, and we conduct comprehensive experiments on how to utilize Unified Memory efficiently by applying memory advises during training. As a result, when the data used for deep learning are divided into three types and a memory advise is applied to each type according to its access pattern, deep learning execution time is reduced by 9.4% compared to default Unified Memory.
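
As a rough illustration of the mechanism the abstract describes, the sketch below allocates managed buffers with cudaMallocManaged and attaches access-pattern hints with cudaMemAdvise. The three advise options shown (cudaMemAdviseSetReadMostly, cudaMemAdviseSetPreferredLocation, cudaMemAdviseSetAccessedBy) are the standard CUDA 8 hints; the assignment of each hint to a particular data type (weights, activations, gradients) is an illustrative assumption for this sketch, not necessarily the exact mapping evaluated in the paper.

// unified_memory_advise.cu -- minimal sketch of CUDA Unified Memory with
// per-buffer memory advise hints; the hint-to-data-type mapping below is
// illustrative, not the paper's measured assignment.
#include <cuda_runtime.h>

int main() {
    int device = 0;
    cudaSetDevice(device);

    const size_t bytes = 256UL * 1024 * 1024;  // e.g., one large tensor

    // Managed allocations share one address space between CPU and GPU;
    // the driver migrates pages on demand, so the total allocation may
    // exceed physical GPU memory.
    float *weights = nullptr, *activations = nullptr, *gradients = nullptr;
    cudaMallocManaged(&weights, bytes, cudaMemAttachGlobal);
    cudaMallocManaged(&activations, bytes, cudaMemAttachGlobal);
    cudaMallocManaged(&gradients, bytes, cudaMemAttachGlobal);

    // Weights are read repeatedly in the forward pass -> ReadMostly lets
    // the driver keep read-only copies near each processor.
    cudaMemAdvise(weights, bytes, cudaMemAdviseSetReadMostly, device);

    // Activations are produced and consumed on the GPU -> preferring GPU
    // residency avoids ping-ponging pages to host memory.
    cudaMemAdvise(activations, bytes, cudaMemAdviseSetPreferredLocation, device);

    // Gradients may be evicted to host memory under pressure -> mapping
    // them into the GPU's page tables lets device accesses succeed
    // without page faults even when pages reside on the CPU.
    cudaMemAdvise(gradients, bytes, cudaMemAdviseSetAccessedBy, device);

    // ... launch forward/backward kernels on these buffers here ...

    cudaDeviceSynchronize();
    cudaFree(weights);
    cudaFree(activations);
    cudaFree(gradients);
    return 0;
}

The hints only steer the driver's page-migration policy; correctness is unchanged if they are omitted, which is why tuning of this kind can be compared directly against default Unified Memory, as the abstract's 9.4% figure does.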
Pages: 17
Related Papers
50 records in total
  • [1] On Efficient Training of Large-Scale Deep Learning Models
    Shen, Li
    Sun, Yan
    Yu, Zhiyuan
    Ding, Liang
    Tian, Xinmei
    Tao, Dacheng
    ACM COMPUTING SURVEYS, 2025, 57 (03)
  • [2] Enabling Efficient Large-Scale Deep Learning Training with Cache Coherent Disaggregated Memory Systems
    Wang, Zixuan
    Sim, Joonseop
    Lim, Euicheol
    Zhao, Jishen
    2022 IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2022), 2022, : 126 - 140
  • [3] GMLake: Efficient and Transparent GPU Memory Defragmentation for Large-scale DNN Training with Virtual Memory Stitching
    Guo, Cong
    Zhang, Rui
    Xu, Jiale
    Leng, Jingwen
    Liu, Zihan
    Huang, Ziyu
    Guo, Minyi
    Wu, Hao
    Zhao, Shouren
    Zhao, Junping
    Zhang, Ke
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS, ASPLOS 2024, VOL 2, 2024, : 450 - 466
  • [4] Efficient Large-scale Deep Learning Framework for Heterogeneous Multi-GPU Cluster
    Kim, Youngrang
    Choi, Hyeonseong
    Lee, Jaehwan
    Kim, Jik-Soo
    Jei, Hyunseung
    Roh, Hongchan
    2019 IEEE 4TH INTERNATIONAL WORKSHOPS ON FOUNDATIONS AND APPLICATIONS OF SELF* SYSTEMS (FAS*W 2019), 2019, : 176 - 181
  • [5] Efficient MPI-AllReduce for large-scale deep learning on GPU-clusters
    Truong Thao Nguyen
    Wahib, Mohamed
    Takano, Ryousei
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (12)
  • [6] Training large-scale language models with limited GPU memory: a survey
    Tang, Yu
    Qiao, Linbo
    Yin, Lujia
    Liang, Peng
    Shen, Ao
    Yang, Zhilin
    Zhang, Lizhi
    Li, Dongsheng
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2025, 26 (03): 309 - 331
  • [7] Resource-efficient Federated Learning for Large-scale Model Training
    Song, Zilin
    Li, Zhengze
    Yuan, Tingting
    Fu, Xiaoming
    PROCEEDINGS OF THE WORKSHOP ON MOBILITY IN THE EVOLVING INTERNET ARCHITECTURE TO BE HELD IN CONJUNCTION WITH MOBICOM 2024, MOBIARCH 2024, 2024, : 43 - 48
  • [8] Large-Scale Semi-Supervised Training in Deep Learning Acoustic Model for ASR
    Long, Yanhua
    Li, Yijie
    Wei, Shuang
    Zhang, Qiaozheng
    Yang, Chunxia
    IEEE ACCESS, 2019, 7 : 133615 - 133627
  • [9] Memory-Efficient Learning for Large-Scale Computational Imaging
    Kellman, Michael
    Zhang, Kevin
    Markley, Eric
    Tamir, Jon
    Bostan, Emrah
    Lustig, Michael
    Waller, Laura
    IEEE TRANSACTIONS ON COMPUTATIONAL IMAGING, 2020, 6 : 1403 - 1414