On Machine Learning-based Stage-aware Performance Prediction of Spark Applications

Cited by: 1
Authors
Ye, Guangjun [1]
Liu, Wuji [2]
Wu, Chase Q. [2]
Shen, Wei [1]
Lyu, Xukang [3]
Affiliations
[1] Zhejiang Sci Tech Univ, Sch Informat Sci & Technol, Hangzhou 310018, Zhejiang, Peoples R China
[2] New Jersey Inst Technol, Dept Comp Sci, Newark, NJ 07102 USA
[3] Tianjin Univ, Coll Intelligence & Comp, Sch Comp Software, Tianjin 300354, Peoples R China
Funding
U.S. National Science Foundation
Keywords
Big data computing; performance modeling; Spark; in-memory processing
DOI
10.1109/IPCCC50635.2020.9391564
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The data volume of large-scale applications in science, engineering, and business has grown explosively over the past decade and now far exceeds the computing capability and storage capacity of any single server. As a viable solution, such data is often stored in distributed file systems and processed by parallel computing engines such as Spark, which has gained popularity over the traditional MapReduce framework due to its fast in-memory processing of streaming data. Spark engines are generally deployed in cloud environments such as Amazon EC2 and Alibaba Cloud. However, storage and computing resources in these environments are typically provisioned on a pay-as-you-go basis, so an accurate estimate of the execution time of Spark workloads is critical to making full use of cloud resources and meeting the performance requirements of end users. Our insight is that the execution patterns of many Spark workloads are qualitatively similar, which makes it possible to leverage historical performance data to predict the execution time of a given Spark application. We use execution information extracted from the Spark History Server as training data and develop a stage-aware hierarchical neural network model for performance prediction. Experimental results show that the proposed hierarchical model achieves higher end-to-end accuracy than a holistic prediction model and also outperforms existing regression-based prediction methods.
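The stage-aware idea in the abstract can be illustrated with a minimal sketch: predict a time for each stage, then aggregate the stage predictions into an application-level estimate. The abstract does not specify the features, the network architecture, or the aggregation rule, so everything below is an assumption made for illustration. The field names are taken from the stage records exposed by Spark's monitoring REST API, `toy_stage_model` is a hypothetical stand-in for a trained per-stage regressor, and summing stage times assumes the stages run one after another.

```python
from typing import Callable, Dict, List

# Assumed feature set: field names as they appear in Spark's stage records.
# Which fields the paper actually uses is not stated in the abstract.
FEATURES = ("numTasks", "inputBytes", "shuffleReadBytes", "shuffleWriteBytes")


def stage_features(stage: Dict) -> List[float]:
    """Turn one stage record into a numeric feature vector (missing fields become 0)."""
    return [float(stage.get(name, 0)) for name in FEATURES]


def predict_application_time(
    stages: List[Dict], stage_model: Callable[[List[float]], float]
) -> float:
    """Stage-aware estimate: predict each stage separately, then sum.

    Summation assumes sequential stages; it is an illustrative simplification,
    not the aggregation used in the paper.
    """
    return sum(stage_model(stage_features(s)) for s in stages)


def toy_stage_model(x: List[float]) -> float:
    """Hypothetical per-stage regressor with made-up coefficients (seconds)."""
    num_tasks, input_bytes, shuffle_read, shuffle_write = x
    return 0.5 * num_tasks + (input_bytes + shuffle_read + shuffle_write) / 1e8


if __name__ == "__main__":
    # Hand-written example records, not measurements from any real job.
    example_stages = [
        {"numTasks": 4, "inputBytes": 100_000_000},
        {"numTasks": 2, "shuffleReadBytes": 200_000_000},
    ]
    print(predict_application_time(example_stages, toy_stage_model))
```

A holistic model, by contrast, would map application-level features directly to a single end-to-end time without looking at individual stages.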
Pages: 8