Task failure resilience technique for improving the performance of MapReduce in Hadoop

Cited by: 10
Authors
Kavitha, C. [1]
Anita, X. [2]
Affiliations
[1] Anna Univ, Dept Informat & Commun Engn, Chennai, Tamil Nadu, India
[2] Jerusalem Coll Engn, Dept Comp Sci & Engn, Chennai, Tamil Nadu, India
Keywords
Hadoop; in-memory; key-value pair; MapReduce; recovery; Redis cache; resilience; task failure;
DOI
10.4218/etrij.2018-0265
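Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Subject classification codes
0808; 0809;
Abstract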
MapReduce is a framework that can process huge datasets in parallel and distributed computing environments. However, a single machine failure during the runtime of MapReduce tasks can increase completion time by 50%. MapReduce handles task failures by restarting the failed task and re-computing all input data from scratch, regardless of how much data had already been processed. To solve this issue, we need the computed key-value pairs to persist in a storage system to avoid re-computing them during the restarting process. In this paper, the task failure resilience (TFR) technique is proposed, which allows the execution of a failed task to continue from the point it was interrupted without having to redo all the work. Amazon ElastiCache for Redis is used as a non-volatile cache for the key-value pairs. We measured the performance of TFR by running different Hadoop benchmarking suites. TFR was implemented using the Hadoop software framework, and the experimental results showed significant performance improvements when compared with the performance of the default Hadoop implementation.
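The resume-from-interruption idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' TFR implementation and does not use Hadoop or Redis APIs: `InMemoryStore` is a hypothetical in-process stand-in for an external key-value cache, and `run_map_task`, `word_count_map`, the `fail_at` parameter, and the task id `"m0"` are all names assumed here for illustration. The sketch only shows how persisting each record's key-value pairs together with an input offset lets a restarted task skip work that was already done.

```python
from typing import Callable, Dict, List, Optional, Tuple

Pair = Tuple[str, int]


class InMemoryStore:
    """Hypothetical stand-in for an external key-value cache (assumption)."""

    def __init__(self) -> None:
        self.pairs: Dict[str, List[Pair]] = {}  # task id -> emitted pairs
        self.offsets: Dict[str, int] = {}       # task id -> records completed

    def save(self, task_id: str, offset: int, new_pairs: List[Pair]) -> None:
        # A real store would need to make these two writes atomic.
        self.pairs.setdefault(task_id, []).extend(new_pairs)
        self.offsets[task_id] = offset

    def load(self, task_id: str) -> Tuple[int, List[Pair]]:
        return self.offsets.get(task_id, 0), list(self.pairs.get(task_id, []))


def run_map_task(
    task_id: str,
    records: List[str],
    map_fn: Callable[[str], List[Pair]],
    store: InMemoryStore,
    fail_at: Optional[int] = None,
) -> List[Pair]:
    """Run a map task, resuming from the last persisted offset if one exists."""
    offset, output = store.load(task_id)
    for i in range(offset, len(records)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated task failure")
        pairs = map_fn(records[i])
        store.save(task_id, i + 1, pairs)  # persist progress after each record
        output.extend(pairs)
    return output


def word_count_map(line: str) -> List[Pair]:
    """Emit (word, 1) for every whitespace-separated word in the line."""
    return [(word, 1) for word in line.split()]


def demo() -> Tuple[List[Pair], int]:
    """Fail on the third record, restart, and count map calls overall."""
    calls = 0

    def counting_map(line: str) -> List[Pair]:
        nonlocal calls
        calls += 1
        return word_count_map(line)

    store = InMemoryStore()
    lines = ["a b", "b c", "c d"]
    try:
        run_map_task("m0", lines, counting_map, store, fail_at=2)
    except RuntimeError:
        pass  # the first attempt dies before processing the third record
    result = run_map_task("m0", lines, counting_map, store)
    return result, calls
```

In `demo()`, the restarted attempt processes only the third record, so the map function runs three times in total rather than five. A restart-from-scratch policy would have recomputed the first two records as well.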
Pages: 751-763
Number of pages: 13
Related papers
50 in total
  • [1] Improving the Shuffle of Hadoop MapReduce
    Li, Jingui
    Lin, Xuelian
    Cui, Xiaolong
    Ye, Yue
    2013 IEEE FIFTH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING TECHNOLOGY AND SCIENCE (CLOUDCOM), VOL 1, 2013, : 266 - 273
  • [2] Improving Hadoop MapReduce performance on heterogeneous single board computer clusters
    Lim, Sooyoung
    Park, Dongchul
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2024, 160 : 752 - 766
  • [3] A Partitioning Technique for Improving the Performance of PageRank on Hadoop
    Choi, Hoon
    Um, Jungho
    Yoon, Hwamook
    Lee, Minho
    Choi, Yunsoo
    Lee, Wongoo
    Song, Sakwang
    Jung, Hanmin
    2012 7TH INTERNATIONAL CONFERENCE ON COMPUTING AND CONVERGENCE TECHNOLOGY (ICCCT2012), 2012, : 458 - 461
  • [4] Improving the efficiency of MapReduce scheduling algorithm in Hadoop
    Thangaselvi, R.
    Ananthbabu, S.
    Jagadeesh, S.
    Aruna, R.
    PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON APPLIED AND THEORETICAL COMPUTING AND COMMUNICATION TECHNOLOGY (ICATCCT), 2015, : 63 - 68
  • [5] Improving the Map and Shuffle Phases in Hadoop MapReduce
    Lakshmi, J. V. N.
    SMART COMPUTING AND INFORMATICS, 2018, 77 : 203 - 212
  • [6] Improving Hadoop MapReduce Performance with Data Compression: A Study using Wordcount Job
    Rattanaopas, Kritwara
    Kaewkeeree, Sureerat
    2017 14TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING/ELECTRONICS, COMPUTER, TELECOMMUNICATIONS AND INFORMATION TECHNOLOGY (ECTI-CON), 2017, : 564 - 567
  • [7] SHadoop: Improving MapReduce performance by optimizing job execution mechanism in Hadoop clusters
    Gu, Rong
    Yang, Xiaoliang
    Yan, Jinshuang
    Sun, Yuanhao
    Wang, Bing
    Yuan, Chunfeng
    Huang, Yihua
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2014, 74 (03) : 2166 - 2179
  • [8] A Hadoop MapReduce Performance Prediction Method
    Song, Ge
    Meng, Zide
    Huet, Fabrice
    Magoules, Frederic
    Yu, Lei
    Lin, Xuelian
    2013 IEEE 15TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2013 IEEE INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (HPCC_EUC), 2013, : 820 - 825
  • [9] Improving MapReduce Performance in Heterogeneous Environments with Adaptive Task Tuning
    Cheng, Dazhao
    Rao, Jia
    Guo, Yanfei
    Zhou, Xiaobo
    ACM/IFIP/USENIX MIDDLEWARE 2014, 2014, : 97 - 108
  • [10] Improving Performance of Heterogeneous MapReduce Clusters with Adaptive Task Tuning
    Cheng, Dazhao
    Rao, Jia
    Guo, Yanfei
    Jiang, Changjun
    Zhou, Xiaobo
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2017, 28 (03) : 774 - 786