A lightweight performance proxy for deep-learning model training on Amazon SageMaker

Cited by: 0
Authors
Tesser, Rafael Keller [1 ,2 ,3 ]
Marques, Alvaro [2 ]
Borin, Edson [2 ]
Affiliations
[1] Univ Campinas Unicamp, Ctr Comp Engn & Sci, Sao Paulo, Brazil
[2] Univ Campinas Unicamp, Inst Comp, Sao Paulo, Brazil
[3] Fed Univ Technol Parana UTFPR, Bachelors Course Comp Sci, Santa Helena, PR, Brazil
Keywords
cloud computing; cost prediction; deep learning; machine learning; performance prediction;
DOI
10.1002/cpe.8104
Chinese Library Classification
TP31 [Computer software];
Discipline Classification Code
081202; 0835;
Abstract
Cloud computing has become a popular way to train deep-learning (DL) models, as it avoids the cost of acquiring and maintaining on-premise systems. SageMaker is a cloud service that automates the execution of DL workloads; its features include automatic hyperparameter optimization and the use of spot instances. Nonetheless, it does not assist users in selecting the right instance type for a workload. In public clouds, the rental price depends on the configuration of the chosen instance type. More advanced, faster instances are typically more expensive, but they are not always the best choice. To select the optimal instance type, users must compare the workload's relative performance (and hence cost) across several candidates. Building on the execution profiles of multiple DL applications, we model the performance and cost of training DL applications on SageMaker and propose a lightweight technique to estimate both at low temporal and monetary cost. This method is a performance proxy that can replace more expensive performance-measurement procedures, and it can therefore speed up any technique that relies on such measurements. We show how it can help cloud customers seeking suitable instance types for training DL models, and that it accurately predicts the performance of different instance types when training these models on SageMaker.
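The core idea described in the abstract, selecting an instance type by comparing relative performance and projected cost, can be sketched as follows. This is not the paper's actual model; it is a minimal illustration that extrapolates total training cost from a short, cheap profiling run on each candidate. The instance names, prices, and per-step times below are illustrative placeholders, not authoritative SageMaker figures.

```python
# Illustrative sketch of a lightweight cost proxy: profile a few training
# steps on each candidate instance type, extrapolate the full training time,
# and rank candidates by estimated total cost. All numbers are hypothetical.

from dataclasses import dataclass


@dataclass
class Candidate:
    name: str              # instance type name (illustrative)
    price_per_hour: float  # hourly rental price in USD (illustrative)
    secs_per_step: float   # mean step time measured in a short profiling run


def estimate_cost(c: Candidate, steps_per_epoch: int, epochs: int) -> float:
    """Extrapolate total training cost from the measured per-step time."""
    total_hours = c.secs_per_step * steps_per_epoch * epochs / 3600.0
    return total_hours * c.price_per_hour


candidates = [
    Candidate("ml.p3.2xlarge", 3.825, 0.42),
    Candidate("ml.g4dn.xlarge", 0.736, 1.10),
    Candidate("ml.c5.9xlarge", 1.836, 2.50),
]

# Rank candidates by projected cost for a 100-epoch, 500-step/epoch job.
ranked = sorted(candidates, key=lambda c: estimate_cost(c, 500, 100))
for c in ranked:
    print(f"{c.name}: ${estimate_cost(c, 500, 100):.2f}")
```

Note that the fastest instance is not necessarily the cheapest overall: in this toy example the slower, cheaper instance wins on total cost, which is precisely why the comparison must consider both speed and price.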
Pages: 22