DeepSpotCloud: Leveraging Cross-Region GPU Spot Instances for Deep Learning

Cited by: 27
Authors
Lee, Kyungyong [1]
Son, Myungjun [1]
Affiliations
[1] Kookmin Univ, Dept Comp Sci, Seoul, South Korea
Funding
National Research Foundation, Singapore;
Keywords
DOI
10.1109/CLOUD.2017.21
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Cloud computing resources equipped with GPU devices are widely used for applications that require extensive parallelism, such as deep learning. When demand for cloud computing instances is low, AWS EC2 offers the surplus resources at a lower price as spot instances. This paper proposes DeepSpotCloud, which uses GPU-equipped spot instances to run deep learning tasks in a cost-efficient and fault-tolerant way. A thorough analysis of spot instance price history logs reveals that GPU spot instances show more dynamic price changes than other, general-purpose types of cloud computing resources. To deal with the price volatility of GPU spot instances, DeepSpotCloud treats instances in different regions across continents as a single resource pool. This paper also proposes a task migration heuristic that uses the checkpointing mechanism of existing deep learning platforms to migrate tasks quickly when a running spot instance is interrupted. Extensive experiments on real AWS services show that the proposed task migration method is effective even in a WAN environment with limited network bandwidth. Comprehensive simulations that replay AWS EC2 price history logs reveal that DeepSpotCloud can achieve 13% more cost gain than a state-of-the-art interrupt-driven scheduling policy. The DeepSpotCloud prototype is implemented using various cloud computing services provided by AWS to serve real deep learning tasks.
Pages: 98-105
Page count: 8