HaoLap: A Hadoop based OLAP system for big data

Cited by: 30
Authors
Song, Jie [1]
Guo, Chaopeng [1]
Wang, Zhi [1]
Zhang, Yichan [1]
Yu, Ge [2]
Pierson, Jean-Marc [3]
Affiliations
[1] Northeastern Univ, Software Coll, Shenyang 110819, Peoples R China
[2] Northeastern Univ, Sch Informat & Engn, Shenyang 110819, Peoples R China
[3] Univ Toulouse 3, Lab IRIT, F-31062 Toulouse, France
Funding
National Research Foundation of Singapore; National Natural Science Foundation of China
Keywords
Cloud data warehouse; Multidimensional data model; MapReduce
DOI
10.1016/j.jss.2014.09.024
Chinese Library Classification
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
In recent years, facing an information explosion, industry and academia have adopted distributed file systems and the MapReduce programming model to address the new challenges that big data has brought. Based on these technologies, this paper presents HaoLap (Hadoop based oLap), an OLAP (OnLine Analytical Processing) system for big data. Drawing on the experience of Multidimensional OLAP (MOLAP), HaoLap adopts a specified multidimensional model to map the dimensions and the measures; a dimension coding and traversal algorithm to achieve the roll-up operation on dimension hierarchies; a partition and linearization algorithm to store dimensions and measures; a chunk selection algorithm to optimize OLAP performance; and MapReduce to execute OLAP. The paper illustrates the key techniques of HaoLap, including system architecture, dimension definition, dimension coding and traversal, partition, data storage, OLAP, and the data loading algorithm. We evaluated HaoLap on a real application and compared it with Hive, HadoopDB, HBaseLattice, and Olap4Cloud. The experimental results show that HaoLap boosts the efficiency of data loading, has a clear advantage in OLAP performance with respect to data set size and query complexity, and fully supports dimension operations. (C) 2014 Elsevier Inc. All rights reserved.
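The abstract names dimension coding, linearization, chunk partitioning, and roll-up without giving details. The sketch below is only a generic MOLAP-style illustration of those ideas, not the algorithms from the paper: it assumes each dimension member is already coded as a small integer, uses plain row-major linearization, fixed-shape chunks, and a roll-up in which every parent has the same number of children. All function names and parameters (`linearize`, `delinearize`, `chunk_of`, `roll_up`, `fanout`) are hypothetical.

```python
def linearize(coords, sizes):
    """Map integer dimension codes to a single row-major index.

    coords: one integer code per dimension; sizes: member count per dimension.
    """
    index = 0
    for c, n in zip(coords, sizes):
        if not 0 <= c < n:
            raise ValueError(f"code {c} is outside dimension of size {n}")
        index = index * n + c
    return index


def delinearize(index, sizes):
    """Inverse of linearize: recover the per-dimension codes."""
    coords = []
    for n in reversed(sizes):
        index, c = divmod(index, n)
        coords.append(c)
    return tuple(reversed(coords))


def chunk_of(coords, chunk_shape):
    """Return the coordinates of the fixed-shape chunk that holds a cell."""
    return tuple(c // s for c, s in zip(coords, chunk_shape))


def roll_up(cells, axis, fanout):
    """Sum measures one level up a hierarchy on the given axis.

    Assumes a uniform hierarchy: parent code = child code // fanout.
    """
    out = {}
    for coords, measure in cells.items():
        parent = coords[:axis] + (coords[axis] // fanout,) + coords[axis + 1:]
        out[parent] = out.get(parent, 0) + measure
    return out


if __name__ == "__main__":
    sizes = (4, 3, 5)  # hypothetical cube: 4 x 3 x 5 members
    idx = linearize((2, 1, 3), sizes)
    print(idx, delinearize(idx, sizes), chunk_of((2, 1, 3), (2, 3, 5)))
    print(roll_up({(0, 0): 1, (1, 0): 2, (2, 0): 3}, axis=0, fanout=2))
```

With integer codes, a cell's storage position and its chunk can be computed arithmetically rather than looked up, which is the general appeal of MOLAP-style layouts; how HaoLap actually does this is described in the paper itself.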
Pages: 167-181
Page count: 15
Related papers
50 in total
  • [21] Retraction Note: Research on intelligent medical big data system based on Hadoop and blockchain
    Xiangfeng Zhang
    Yanmei Wang
    EURASIP Journal on Wireless Communications and Networking, 2023
  • [22] Design of big data processing system architecture based on Hadoop Under the cloud computing
    Duan, Chunmei
    MECHATRONICS ENGINEERING, COMPUTING AND INFORMATION TECHNOLOGY, 2014, 556-562 : 6302 - 6306
  • [23] RETRACTED ARTICLE: Research on intelligent medical big data system based on Hadoop and blockchain
    Xiangfeng Zhang
    Yanmei Wang
    EURASIP Journal on Wireless Communications and Networking, 2021
  • [24] Big Data Performance Analysis on a Hadoop Distributed File System Based on Geometric Data Perturbation Technique
    Marichamy, V. Santhana
    Natarajan, V.
    2ND INTERNATIONAL CONFERENCE ON RECENT TRENDS IN ADVANCED COMPUTING ICRTAC-DISRUPTIVE INNOVATION, 2019, 2019, 165 : 415 - 420
  • [25] Scheduling in Big Data Heterogeneous Distributed System Using Hadoop
    Thakkar, Shraddha
    Patel, Sanjay
    PROCEEDINGS OF INTERNATIONAL CONFERENCE ON ICT FOR SUSTAINABLE DEVELOPMENT ICT4SD 2015, VOL 2, 2016, 409 : 119 - 131
  • [26] Big Data Analytics- Recommendation System with Hadoop Framework
    Kadam, Sayali D.
    Motwani, Dilip
    Vaidya, Siddhesh A.
    2016 INTERNATIONAL CONFERENCE ON INVENTIVE COMPUTATION TECHNOLOGIES (ICICT), VOL 3, 2015, : 906 - 910
  • [27] Haery: A Hadoop Based Query System on Accumulative and High-Dimensional Data Model for Big Data
    Song, Jie
    He, HongYan
    Thomas, Richard
    Bao, Yubin
    Yu, Ge
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2020, 32 (07) : 1362 - 1377
  • [28] What-If Query Processing Policy for Big Data in OLAP System
    Xu, Huan
    Luo, Hao
    He, Jieyue
    2013 INTERNATIONAL CONFERENCE ON ADVANCED CLOUD AND BIG DATA (CBD), 2013, : 110 - 116
  • [29] OLAP*: Effectively and Efficiently Supporting Parallel OLAP over Big Data
    Cuzzocrea, Alfredo
    Moussa, Rim
    Xu, Guandong
    MODEL AND DATA ENGINEERING, MEDI 2013, 2013, 8216 : 38 - 49
  • [30] Forensic Investigation through Data Remnants on Hadoop Big Data Storage System
    Oo, Myat Nandar
    Parvin, Sazia
    Thein, Thandar
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2018, 33 (03) : 203 - 217