A lightweight performance proxy for deep-learning model training on Amazon SageMaker

Cited by: 0
Authors
Tesser, Rafael Keller [1 ,2 ,3 ]
Marques, Alvaro [2 ]
Borin, Edson [2 ]
Affiliations
[1] Univ Campinas Unicamp, Ctr Comp Engn & Sci, Sao Paulo, Brazil
[2] Univ Campinas Unicamp, Inst Comp, Sao Paulo, Brazil
[3] Fed Univ Technol Parana UTFPR, Bachelors Course Comp Sci, Santa Helena, PR, Brazil
Keywords
cloud computing; cost prediction; deep learning; machine learning; performance prediction;
DOI
10.1002/cpe.8104
Chinese Library Classification
TP31 [Computer software];
Discipline Classification Code
081202; 0835;
Abstract
Cloud computing has become a popular way to train deep-learning (DL) models, as it avoids the cost of acquiring and maintaining on-premise systems. SageMaker is a cloud service that automates the execution of DL workloads; its features include automatic hyperparameter optimization and the use of spot instances. Nonetheless, it does not assist users in selecting the right instance type for a workload. In public clouds, the rental price depends on the configuration of the chosen instance type. More advanced, faster instances are typically more expensive, but they are not always the best choice. To select the optimal instance type, users must compare the workload's relative performance (and hence cost) across several candidates. Building on the execution profiles of multiple DL applications, we model the performance and cost of training DL applications on SageMaker and propose a lightweight technique to estimate both at low temporal and monetary cost. This method is a performance proxy that can replace more expensive performance-measurement procedures, and it can therefore speed up any technique that relies on such measurements. We show how it can help cloud customers seeking suitable instance types for training DL models, and that it accurately predicts the performance of different instance types when training these models on SageMaker.
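The core idea described in the abstract, selecting an instance type by comparing relative performance and projected cost, can be sketched as follows. This is not the paper's actual model; it is a minimal illustration that extrapolates total training cost from a short, cheap profiling run on each candidate. The instance names, prices, and per-step times below are illustrative placeholders, not authoritative SageMaker figures.

```python
# Illustrative sketch of a lightweight cost proxy: profile a few training
# steps on each candidate instance type, extrapolate the full training time,
# and rank candidates by estimated total cost. All numbers are hypothetical.

from dataclasses import dataclass


@dataclass
class Candidate:
    name: str              # instance type name (illustrative)
    price_per_hour: float  # hourly rental price in USD (illustrative)
    secs_per_step: float   # mean step time measured in a short profiling run


def estimate_cost(c: Candidate, steps_per_epoch: int, epochs: int) -> float:
    """Extrapolate total training cost from the measured per-step time."""
    total_hours = c.secs_per_step * steps_per_epoch * epochs / 3600.0
    return total_hours * c.price_per_hour


candidates = [
    Candidate("ml.p3.2xlarge", 3.825, 0.42),
    Candidate("ml.g4dn.xlarge", 0.736, 1.10),
    Candidate("ml.c5.9xlarge", 1.836, 2.50),
]

# Rank candidates by projected cost for a 100-epoch, 500-step/epoch job.
ranked = sorted(candidates, key=lambda c: estimate_cost(c, 500, 100))
for c in ranked:
    print(f"{c.name}: ${estimate_cost(c, 500, 100):.2f}")
```

Note that the fastest instance is not necessarily the cheapest overall: in this toy example the slower, cheaper instance wins on total cost, which is precisely why the comparison must consider both speed and price.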
Pages: 22