Performance Modeling of MapReduce Jobs in Heterogeneous Cloud Environments

Cited by: 45
Authors
Zhang, Zhuoyao [1]
Cherkasova, Ludmila [2]
Loo, Boon Thau [1]
Affiliations
[1] Univ Penn, Philadelphia, PA 19104 USA
[2] Hewlett Packard Labs, Palo Alto, CA USA
Keywords
MapReduce; heterogeneous clusters; performance modeling; efficiency
DOI
10.1109/CLOUD.2013.107
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Many companies have started using Hadoop for advanced data analytics over large datasets. While a traditional Hadoop deployment assumes a homogeneous cluster, many enterprise clusters grow incrementally over time and may contain a variety of different servers. This node heterogeneity poses an additional challenge for efficient cluster and job management. Because of resource heterogeneity, it is often unclear which resources introduce inefficiency and bottlenecks, and how such a Hadoop cluster should be configured and optimized. In this work, we explore the efficiency and accuracy of the bounds-based performance model for predicting MapReduce job completion times in heterogeneous Hadoop clusters. We validate the accuracy of the proposed performance model using a diverse set of 13 realistic applications and two different heterogeneous clusters. Since one of the Hadoop clusters is formed by VM instances of different capacities in the Amazon EC2 environment, we additionally explore and discuss factors that impact MapReduce job performance in the Cloud.
Pages: 839-846
Page count: 8