On Machine Learning-based Stage-aware Performance Prediction of Spark Applications

Authors
Ye, Guangjun [1]
Liu, Wuji [2]
Wu, Chase Q. [2]
Shen, Wei [1]
Lyu, Xukang [3]
Affiliations
[1] Zhejiang Sci Tech Univ, Sch Informat Sci & Technol, Hangzhou 310018, Zhejiang, Peoples R China
[2] New Jersey Inst Technol, Dept Comp Sci, Newark, NJ 07102 USA
[3] Tianjin Univ, Coll Intelligence & Comp, Sch Comp Software, Tianjin 300354, Peoples R China
Funding
National Science Foundation (USA)
Keywords
Big data computing; performance modeling; Spark; in-memory processing
DOI
10.1109/IPCCC50635.2020.9391564
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The data volume of large-scale applications in science, engineering, and business has grown explosively over the past decade, far beyond the computing capability and storage capacity of any single server. Such data is therefore often stored in distributed file systems and processed by parallel computing engines such as Spark, which has gained popularity over the traditional MapReduce framework due to its fast in-memory processing of streaming data. Spark engines are generally deployed in cloud environments such as Amazon EC2 and Alibaba Cloud. However, storage and computing resources in these environments are typically provisioned on a pay-as-you-go basis, so an accurate estimate of the execution time of Spark workloads is critical to making full use of cloud resources and meeting the performance requirements of end users. Our insight is that the execution patterns of many Spark workloads are qualitatively similar, which makes it possible to leverage historical performance data to predict the execution time of a given Spark application. We use execution information extracted from the Spark History Server as training data and develop a stage-aware hierarchical neural network model for performance prediction. Experimental results show that the proposed hierarchical model achieves higher accuracy than a holistic prediction model at the end-to-end level, and also outperforms other existing regression-based prediction methods.
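The stage-aware idea in the abstract can be illustrated with a minimal sketch. This is not the authors' model: a one-feature least-squares line stands in for the neural network, the record fields (`stage_id`, `input_mb`, `duration_s`) are hypothetical names rather than the Spark History Server schema, and stages are assumed to run one after another so that the end-to-end time is the sum of the per-stage predictions.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # All inputs identical: fall back to predicting the mean duration.
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov / var_x
    return a, mean_y - a * mean_x


def fit_stage_models(history):
    """Fit one model per stage from historical runs.

    history: list of runs; each run is a list of dicts with the
    hypothetical keys 'stage_id', 'input_mb', and 'duration_s'.
    Returns {stage_id: (slope, intercept)}.
    """
    samples = defaultdict(lambda: ([], []))
    for run in history:
        for rec in run:
            xs, ys = samples[rec["stage_id"]]
            xs.append(rec["input_mb"])
            ys.append(rec["duration_s"])
    return {sid: fit_line(xs, ys) for sid, (xs, ys) in samples.items()}


def predict_total(models, stage_inputs):
    """Sum per-stage predictions (assumes sequential stage execution)."""
    return sum(a * stage_inputs[sid] + b for sid, (a, b) in models.items())
```

Calling `fit_stage_models` on a list of past runs and passing the result to `predict_total` with the expected input size of each stage yields an end-to-end estimate. A holistic baseline would instead fit a single model on total input size against total run time.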
Pages: 8