Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Framework

Cited by: 26
Authors
Zhao, Yaxiong [1, 2]
Wu, Jie [2 ]
Liu, Cong [3 ]
Affiliations
[1] Google Inc, Mountain View, CA 94043 USA
[2] Temple Univ, Philadelphia, PA 19122 USA
[3] Sun Yat Sen Univ, Guangzhou 510275, Guangdong, Peoples R China
Keywords
big-data; MapReduce; Hadoop; caching
DOI
10.1109/TST.2014.6733207
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The buzzword big-data refers to large-scale distributed data processing applications that operate on exceptionally large amounts of data. Google's MapReduce and Apache's Hadoop, its open-source implementation, are the de facto software systems for big-data applications. An observation about the MapReduce framework is that it generates a large amount of intermediate data. This abundant information is thrown away after the tasks finish, because MapReduce is unable to utilize it. In this paper, we propose Dache, a data-aware cache framework for big-data applications. In Dache, tasks submit their intermediate results to the cache manager. A task queries the cache manager before executing the actual computing work. A novel cache description scheme and a cache request and reply protocol are designed. We implement Dache by extending Hadoop. Testbed experiment results demonstrate that Dache significantly improves the completion time of MapReduce jobs.
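The query-then-submit flow described in the abstract can be illustrated with a small sketch. This is not Dache's implementation or its cache description scheme, neither of which this record specifies. It is a toy in-memory model that assumes a cache entry can be keyed by an input identifier plus an operation name. The names CacheManager, run_task, and word_count are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Hashable, Tuple

# Assumption: a cached result is identified by (input identifier, operation name).
# The record does not describe Dache's real cache description scheme.
CacheKey = Tuple[Hashable, str]


class CacheManager:
    """Toy in-memory stand-in for a cache manager (hypothetical, not Dache's design)."""

    def __init__(self) -> None:
        self._store: Dict[CacheKey, object] = {}
        self.hits = 0
        self.misses = 0

    def query(self, key: CacheKey) -> Tuple[bool, object]:
        """Return (found, value); value is None when nothing is cached for key."""
        if key in self._store:
            self.hits += 1
            return True, self._store[key]
        self.misses += 1
        return False, None

    def submit(self, key: CacheKey, result: object) -> None:
        """Store a task's intermediate result under key."""
        self._store[key] = result


def run_task(manager: CacheManager, input_id: Hashable, op_name: str,
             compute: Callable, data):
    """Query the cache first; compute and submit the result only on a miss."""
    key = (input_id, op_name)
    found, cached = manager.query(key)
    if found:
        return cached
    result = compute(data)
    manager.submit(key, result)
    return result


def word_count(lines):
    """Example map-style computation: count word occurrences in a list of lines."""
    counts: Dict[str, int] = {}
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


if __name__ == "__main__":
    manager = CacheManager()
    split = ["a b a", "b c"]
    first = run_task(manager, "split-0", "word_count", word_count, split)
    second = run_task(manager, "split-0", "word_count", word_count, split)
    print(first == second, manager.hits, manager.misses)
```

In this sketch the second call with the same input identifier and operation name is answered from the cache, so the computation runs only once. A real deployment would also need to handle distribution across nodes, eviction, and invalidation when inputs change, which the sketch leaves out.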
Pages: 39-50
Number of pages: 12
Related papers
50 in total
  • [1] Dache: A Data Aware Caching for Big-Data Applications Using The MapReduce Framework
    Zhao, Yaxiong
    Wu, Jie
    2013 PROCEEDINGS IEEE INFOCOM, 2013, : 35 - 39
  • [2] Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Framework
    Yaxiong Zhao
    Jie Wu
    Cong Liu
    Tsinghua Science and Technology, 2014, 19 (01) : 39 - 50
  • [3] Dache: A data aware caching for big-data applications using the MapReduce framework
    Zhao, Y. (yaxiongzhao@google.com), 1600, Tsinghua University (19):
  • [4] A Survey on Geographically Distributed Big-Data Processing Using MapReduce
    Dolev, Shlomi
    Florissi, Patricia
    Gudes, Ehud
    Sharma, Shantanu
    Singer, Ido
    IEEE TRANSACTIONS ON BIG DATA, 2019, 5 (01) : 60 - 80
  • [5] Secure Scalar Product for Big-Data in MapReduce
    Liu, Fang
    Ng, Wee Keong
    Zhang, Wei
    2015 IEEE FIRST INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING SERVICE AND APPLICATIONS (BIGDATASERVICE 2015), 2015, : 120 - 129
  • [6] Big Data Analysis Solutions using MapReduce Framework
    Elagib, Sara B.
    Najeeb, Atahur Rahman
    Hashim, Aisha H.
    Olanrewaju, Rashidah F.
    2014 INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION ENGINEERING (ICCCE), 2014, : 127 - 130
  • [7] Energy-Aware Scheduling of MapReduce Jobs for Big Data Applications
    Mashayekhy, Lena
    Nejad, Mahyar Movahed
    Grosu, Daniel
    Zhang, Quan
    Shi, Weisong
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2015, 26 (10) : 2720 - 2733
  • [8] On Traffic-Aware Partition and Aggregation in MapReduce for Big Data Applications
    Ke, Huan
    Li, Peng
    Guo, Song
    Guo, Minyi
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2016, 27 (03) : 818 - 828
  • [9] A big-data processing framework for uncertainties in transportation data
    Yang, Jie
    Ma, Jun
    2015 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2015), 2015,
  • [10] Big-Data Applications in the Government Sector
    Kim, Gang-Hoon
    Trimi, Silvana
    Chung, Ji-Hyong
    COMMUNICATIONS OF THE ACM, 2014, 57 (03) : 78 - 85