Efficient Use of GPU Memory for Large-Scale Deep Learning Model Training

Cited: 13
Authors
Choi, Hyeonseong [1]
Lee, Jaehwan [1]
Affiliations
[1] Korea Aerospace University, School of Electronics and Information Engineering, Goyang-si 10540, South Korea
Source
APPLIED SCIENCES-BASEL, 2021, Vol. 11, No. 21
Funding
National Research Foundation of Singapore
Keywords
deep learning; large-scale model; CUDA Unified Memory; PyTorch
DOI
10.3390/app112110377
Chinese Library Classification
O6 [Chemistry]
Discipline Classification Code
0703
Abstract
To achieve high accuracy in deep learning, it is necessary to use a large-scale model. However, the limited capacity of GPU memory makes it difficult to train such models on a single GPU. With CUDA 6, NVIDIA introduced CUDA Unified Memory, which overcomes this limitation by virtually combining GPU memory and CPU memory, and CUDA 8 added memory advise options for using Unified Memory more efficiently. In this work, we propose a newly optimized scheme based on CUDA Unified Memory that uses GPU memory efficiently by applying a different memory advise to each data type according to its access pattern during deep learning training. We apply CUDA Unified Memory to PyTorch to evaluate the performance of large-scale models on the expanded GPU memory, and we conduct comprehensive experiments on how to utilize Unified Memory efficiently by applying memory advises during training. As a result, when the data used for deep learning are divided into three types and a memory advise is applied to each type according to its access pattern, deep learning execution time is reduced by 9.4% compared to default Unified Memory.
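The mechanism the abstract relies on maps onto two CUDA runtime calls: cudaMallocManaged, which places an allocation in Unified Memory so its pages can migrate between CPU and GPU on demand, and cudaMemAdvise, which attaches an access-pattern hint to an allocation. The sketch below illustrates this mechanism only; the three-way split into weights, gradients, and activations and the particular hint chosen for each are illustrative assumptions, not the exact per-type policy of the paper or its PyTorch integration.

```
// Minimal CUDA C++ sketch (illustrative, not the authors' code):
// three managed buffers stand in for three data types, each given a
// different cudaMemAdvise hint for an assumed access pattern.
#include <cuda_runtime.h>

int main() {
    int dev = 0;
    cudaSetDevice(dev);

    const size_t bytes = (1ULL << 24) * sizeof(float);  // 64 MiB each
    float *weights, *gradients, *activations;

    // Unified Memory: pages migrate between host and device on demand,
    // so the total allocation may exceed physical GPU memory.
    cudaMallocManaged(&weights,     bytes);
    cudaMallocManaged(&gradients,   bytes);
    cudaMallocManaged(&activations, bytes);

    // Assumed pattern: weights are touched by the GPU every iteration,
    // so prefer keeping their pages resident on the device.
    cudaMemAdvise(weights, bytes, cudaMemAdviseSetPreferredLocation, dev);

    // Assumed pattern: activations are written in the forward pass and
    // read again in the backward pass; naming the GPU as an accessing
    // processor keeps its mappings valid across page migrations.
    cudaMemAdvise(activations, bytes, cudaMemAdviseSetAccessedBy, dev);

    // Assumed pattern: gradients are read far more often than written
    // once produced; read-mostly lets the driver keep read-only copies
    // on both processors instead of bouncing pages between them.
    cudaMemAdvise(gradients, bytes, cudaMemAdviseSetReadMostly, dev);

    // ... training kernels would launch here ...
    cudaDeviceSynchronize();

    cudaFree(weights);
    cudaFree(gradients);
    cudaFree(activations);
    return 0;
}
```

When a hint matches the real access pattern, the driver avoids needless page migrations between host and device, which is the effect behind the 9.4% reduction in execution time the abstract reports.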
Pages: 17
Related Papers
50 items in total
  • [41] Large-Scale Pairwise Sequence Alignments on a Large-Scale GPU Cluster
    Savran, Ibrahim
    Gao, Yang
    Bakos, Jason D.
    IEEE DESIGN & TEST, 2014, 31 (01) : 51 - 61
  • [42] Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training
    Li, Shenggui
    Liu, Hongxin
    Bian, Zhengda
    Fang, Jiarui
    Huang, Haichen
    Liu, Yuliang
    Wang, Boxiang
    You, Yang
    PROCEEDINGS OF THE 52ND INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2023, 2023, : 766 - 775
  • [43] Hybrid Electrical/Optical Switch Architectures for Training Distributed Deep Learning in Large-Scale
    Thao-Nguyen Truong
    Takano, Ryousei
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2021, E104D (08) : 1332 - 1339
  • [44] Efficient Multi-GPU Memory Management for Deep Learning Acceleration
    Kim, Youngrang
    Lee, Jaehwan
    Kim, Jik-Soo
    Jei, Hyunseung
    Roh, Hongchan
    2018 IEEE 3RD INTERNATIONAL WORKSHOPS ON FOUNDATIONS AND APPLICATIONS OF SELF* SYSTEMS (FAS*W), 2018, : 37 - 43
  • [45] Cost Efficient GPU Cluster Management for Training and Inference of Deep Learning
    Kang, Dong-Ki
    Lee, Ki-Beom
    Kim, Young-Chon
    ENERGIES, 2022, 15 (02)
  • [46] Crux: GPU-Efficient Communication Scheduling for Deep Learning Training
    Cao, Jiamin
    Guan, Yu
    Qian, Kun
    Gao, Jiaqi
    Xiao, Wencong
    Dong, Jianbo
    Fu, Binzhang
    Cai, Dennis
    Zhai, Ennan
    PROCEEDINGS OF THE 2024 ACM SIGCOMM 2024 CONFERENCE, ACM SIGCOMM 2024, 2024, : 1 - 15
  • [47] Drones and deep learning produce accurate and efficient monitoring of large-scale seabird colonies
    Hayes, Madeline C.
    Gray, Patrick C.
    Harris, Guillermo
    Sedgwick, Wade C.
    Crawford, Vivon D.
    Chazal, Natalie
    Crofts, Sarah
    Johnston, David W.
    ORNITHOLOGICAL APPLICATIONS, 2021, 123 (03)
  • [48] EFFICIENT LARGE-SCALE DAMAGE ASSESSMENT AFTER NATURAL DISASTERS WITH UAVS AND DEEP LEARNING
    Rahnemoonfar, Maryam
    Safavi, Farshad
    IGARSS 2023 - 2023 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2023, : 1668 - 1671
  • [49] Efficient large-scale cell classification and analysis for MultiOmyx™ assays: A deep learning approach
    Nagy, Mate L.
    Hanifi, Arezoo
    Tirupsur, Ahalya
    Wong, Geoffrey
    Fang, Jun
    Hoe, Nicholas
    Au, Qingyan
    Padmanabhan, Raghav K.
    CANCER RESEARCH, 2018, 78 (13)
  • [50] Efficient Pump Scheduling for Large-Scale Multiproduct Pipelines Using Deep Reinforcement Learning
    Shao, Kai
    Wang, Xinmin
    Liu, Min
    Xu, Aobo
    Jian, Ling
    INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING, 2024