Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Framework

Cited by: 26
Authors
Zhao, Yaxiong [1, 2]
Wu, Jie [2]
Liu, Cong [3]
Affiliations
[1] Google Inc, Mountain View, CA 94043 USA
[2] Temple Univ, Philadelphia, PA 19122 USA
[3] Sun Yat Sen Univ, Guangzhou 510275, Guangdong, People's Republic of China
Keywords
big-data; MapReduce; Hadoop; caching
DOI
10.1109/TST.2014.6733207
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The buzzword big-data refers to large-scale distributed data processing applications that operate on exceptionally large amounts of data. Google's MapReduce and Apache's Hadoop, its open-source implementation, are the de facto software systems for big-data applications. One observation about the MapReduce framework is that it generates a large amount of intermediate data. This abundant information is thrown away after the tasks finish, because MapReduce is unable to utilize it. In this paper, we propose Dache, a data-aware cache framework for big-data applications. In Dache, tasks submit their intermediate results to the cache manager, and a task queries the cache manager before executing the actual computing work. A novel cache description scheme and a cache request and reply protocol are designed. We implement Dache by extending Hadoop. Testbed experiment results demonstrate that Dache significantly improves the completion time of MapReduce jobs.
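The query-before-compute pattern described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation or protocol: the `CacheManager` class, the `(input_id, operation)` key, and the `run_task` helper are all hypothetical names chosen for illustration, and the cache is a plain in-memory dictionary rather than an extension of Hadoop.

```python
from typing import Callable, Dict, Hashable, Tuple

# Assumed description scheme: a cached item is identified by the input it was
# computed from and the operation applied to it. The paper's real scheme may differ.
CacheKey = Tuple[Hashable, str]


class CacheManager:
    """Hypothetical in-memory stand-in for a cache manager."""

    def __init__(self) -> None:
        self._store: Dict[CacheKey, object] = {}
        self.hits = 0
        self.misses = 0

    def query(self, key: CacheKey) -> Tuple[bool, object]:
        """Return (found, result); result is None on a miss."""
        if key in self._store:
            self.hits += 1
            return True, self._store[key]
        self.misses += 1
        return False, None

    def submit(self, key: CacheKey, result: object) -> None:
        """Store an intermediate result so later tasks can reuse it."""
        self._store[key] = result


def run_task(cache: CacheManager, input_id: Hashable, operation: str,
             compute: Callable[[], object]) -> object:
    """Query the cache first; compute and submit only on a miss."""
    key = (input_id, operation)
    found, cached = cache.query(key)
    if found:
        return cached
    result = compute()
    cache.submit(key, result)
    return result


def word_count(text: str) -> Dict[str, int]:
    """Toy map-style computation used as the 'actual computing work'."""
    counts: Dict[str, int] = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts
```

In this sketch, a second call to `run_task` with the same `input_id` and `operation` returns the stored result without invoking `compute` again, which is the source of the saved work the abstract refers to.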
Pages: 39 - 50
Page count: 12
Related papers
50 in total
  • [21] Optimizing Read-Once Data Flow in Big-Data Applications
    Morad, Tomer Y.
    Shomron, Gil
    Erez, Mattan
    Kolodny, Avinoam
    Weiser, Uri C.
    IEEE COMPUTER ARCHITECTURE LETTERS, 2017, 16 (01) : 68 - 71
  • [22] Big-Data Framework for Electric Vehicle Range Estimation
    Rahimi-Eichi, Habiballah
    Chow, Mo-Yuen
    IECON 2014 - 40TH ANNUAL CONFERENCE OF THE IEEE INDUSTRIAL ELECTRONICS SOCIETY, 2014, : 5628 - 5634
  • [23] A Framework for Aligning Big-Data Projects with Organizational Strategy
    Lakoju, Mike
    Serrano, Alan
    AMCIS 2017 PROCEEDINGS, 2017,
  • [24] Big data classification with optimization driven MapReduce framework
    Mohammed, Mujeeb Shaik
    Rachapudy, Praveen Sam
    Kasa, Madhavi
    INTERNATIONAL JOURNAL OF KNOWLEDGE-BASED AND INTELLIGENT ENGINEERING SYSTEMS, 2021, 25 (02) : 173 - 183
  • [25] Big-Data Visualization
    Keim, Daniel
    Qu, Huamin
    Ma, Kwan-Liu
    IEEE COMPUTER GRAPHICS AND APPLICATIONS, 2013, 33 (04) : 20 - 21
  • [26] From photons to big-data applications: terminating terabits
    Zilberman, Noa
    Moore, Andrew W.
    Crowcroft, Jon A.
    PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES, 2016, 374 (2062):
  • [27] Advances in modelling and simulation for big-data applications (AMSBA)
    Pop, Florin
    Iacono, Mauro
    Gribaudo, Marco
    Kolodziej, Joanna
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2016, 28 (02) : 291 - 293
  • [28] Big data analytics for retail industry using MapReduce-Apriori framework
    Verma, Neha
    Malhotra, Dheeraj
    Singh, Jatinder
    JOURNAL OF MANAGEMENT ANALYTICS, 2020, 7 (03) : 424 - 442
  • [29] A Distributed Framework for Predictive Analytics Using Big Data and MapReduce Parallel Programming
    Natesan P.
    Sathishkumar V.E.
    Mathivanan S.K.
    Venkatasen M.
    Jayagopal P.
    Allayear S.M.
    Mathematical Problems in Engineering, 2023, 2023
  • [30] A Novel Big-Data Processing Framwork for Healthcare Applications Big-Data-Healthcare-in-a-Box
    Rahman, Fuad
    Slepian, Marvin
    Mitra, Ari
    2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2016, : 3548 - 3555