Performance Modeling of MapReduce Jobs in Heterogeneous Cloud Environments

Cited by: 45
Authors
Zhang, Zhuoyao [1]
Cherkasova, Ludmila [2]
Loo, Boon Thau [1]
Affiliations
[1] Univ Penn, Philadelphia, PA 19104 USA
[2] Hewlett Packard Labs, Palo Alto, CA USA
Keywords
MapReduce; heterogeneous clusters; performance modeling; efficiency
DOI
10.1109/CLOUD.2013.107
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Many companies are adopting Hadoop for advanced data analytics over large datasets. While a traditional Hadoop cluster deployment assumes a homogeneous cluster, many enterprise clusters grow incrementally over time and may contain a variety of different servers. This node heterogeneity poses an additional challenge for efficient cluster and job management. Because of resource heterogeneity, it is often unclear which resources introduce inefficiency and bottlenecks, and how such a Hadoop cluster should be configured and optimized. In this work, we explore the efficiency and accuracy of the bounds-based performance model for predicting MapReduce job completion times in heterogeneous Hadoop clusters. We validate the accuracy of the proposed performance model using a diverse set of 13 realistic applications and two different heterogeneous clusters. Since one of the Hadoop clusters is formed by VM instances of different capacities in the Amazon EC2 environment, we additionally explore and discuss factors that impact MapReduce job performance in the cloud.
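The abstract does not spell out the model itself. As a rough illustration of what a bounds-based estimate can look like, the sketch below computes the classic lower and upper makespan bounds for n tasks assigned greedily to k identical slots: n·avg/k and (n−1)·avg/k + max. This is an assumption about the general technique, not a reproduction of the paper's model; the function names and the sample task durations are hypothetical.

```python
import heapq


def makespan_bounds(durations, slots):
    """Return (lower, upper) bounds on the completion time of a stage
    whose tasks are assigned greedily to `slots` identical slots.

    lower = n * avg / k
    upper = (n - 1) * avg / k + max
    """
    if slots < 1 or not durations:
        raise ValueError("need at least one slot and one task")
    n = len(durations)
    avg = sum(durations) / n
    longest = max(durations)
    lower = n * avg / slots
    upper = (n - 1) * avg / slots + longest
    return lower, upper


def greedy_makespan(durations, slots):
    """Simulate greedy assignment: each task goes to the slot that frees up first."""
    free_at = [0.0] * slots          # time at which each slot becomes free
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + d)
    return max(free_at)


# Hypothetical map-task durations (seconds) and slot count.
tasks = [10, 12, 8, 10]
low, high = makespan_bounds(tasks, slots=2)
actual = greedy_makespan(tasks, slots=2)
print(low, actual, high)
```

The simulated greedy schedule falls between the two bounds. In a heterogeneous cluster the spread between average and maximum task duration tends to widen, which loosens the upper bound.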
Pages: 839-846
Page count: 8