A general analytical model for spatial and temporal performance of bitmap index compression algorithms in Big Data

Authors
Wu, Yinjun [1]
Chen, Zhen [1]
Wen, Yuhao [1]
Cao, Junwei [1]
Zheng, Wenxun [1]
Ma, Ge [1]
Affiliations
[1] Tsinghua Univ, Res Inst Informat Technol, Tsinghua Natl Lab Informat Sci & Technol TNList, Beijing, Peoples R China
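Keywords
bitmap index; Big Data; COMBAT; SECOMPAX; CONCISE; data compression; performance evaluation
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract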
Bitmap indexing supports flexible Boolean operations in data retrieval, and query processing based on bitmap indexes is very fast. It has therefore been widely used in big data analytics platforms such as Druid and Spark. However, a bitmap index can consume a large amount of memory, which has led to the invention of various bitmap index compression algorithms that reduce space without sacrificing temporal performance. In practice, it is often difficult to choose a proper algorithm for a specific problem. Moreover, after devising a new algorithm that may outperform existing ones, it is essential to evaluate its performance in theory. Without appropriate theoretical analysis, the deficiencies of a new algorithm cannot be spotted until the final experimental results are obtained, which wastes much time and effort. In this paper, we propose a general analytical model of both the spatial and temporal performance of bitmap index compression algorithms, which can be applied to all kinds of algorithms derived from WAH (Word-Aligned Hybrid). The model treats two types of bitmaps separately: uniformly distributed bitmaps and clustered bitmaps. To illustrate the model, several bitmap index compression algorithms are analyzed and compared with each other: COMBAT (COMbining Binary And Ternary encoding), SECOMPAX (Scope Extended COMPAX) and CONCISE (Compressed 'n' Composable Integer Set), all of which are derived from WAH. Evaluation results for these algorithms from MATLAB simulation are also presented. This paper paves the way for further research on the performance evaluation of bitmap index compression algorithms.
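The distinction the abstract draws between uniformly distributed and clustered bitmaps can be illustrated with a small simulation. The sketch below is not the paper's analytical model and not its MATLAB code. It assumes plain WAH with 32-bit words (31 payload bits per word) rather than COMBAT, SECOMPAX or CONCISE, and it assumes a two-state Markov chain as the clustered-bitmap generator. The function names and the parameters `density` and `clustering` are hypothetical, chosen only for this example.

```python
import random

GROUP_BITS = 31  # payload bits per 32-bit WAH word (assumed word size)


def uniform_bitmap(n, density, rng):
    """Each bit is set independently with probability `density`."""
    return [1 if rng.random() < density else 0 for _ in range(n)]


def clustered_bitmap(n, density, clustering, rng):
    """Assumed generator: a two-state Markov chain whose runs of 1s have
    mean length `clustering` and whose long-run bit density is `density`."""
    p_1_to_0 = 1.0 / clustering
    p_0_to_1 = density / ((1.0 - density) * clustering)
    bits = []
    bit = 1 if rng.random() < density else 0
    for _ in range(n):
        bits.append(bit)
        flip = p_1_to_0 if bit else p_0_to_1
        if rng.random() < flip:
            bit ^= 1
    return bits


def wah_word_count(bits):
    """Count the words in a plain WAH encoding of `bits`.

    A full 31-bit group of identical bits extends (or starts) a fill word;
    any other group costs one literal word. Fill-counter overflow is
    ignored, and a trailing partial group is counted as a literal.
    """
    words = 0
    prev_fill = None  # fill bit of the preceding fill word, None after a literal
    for start in range(0, len(bits), GROUP_BITS):
        group = bits[start:start + GROUP_BITS]
        ones = sum(group)
        if len(group) == GROUP_BITS and ones in (0, GROUP_BITS):
            fill = 1 if ones else 0
            if fill != prev_fill:
                words += 1  # a new fill word begins
            prev_fill = fill
        else:
            words += 1  # literal word
            prev_fill = None
    return words


if __name__ == "__main__":
    rng = random.Random(0)
    n = GROUP_BITS * 2000
    uniform = uniform_bitmap(n, 0.01, rng)
    clustered = clustered_bitmap(n, 0.01, 100, rng)
    print("uncompressed words:", n // GROUP_BITS)
    print("WAH words, uniform:  ", wah_word_count(uniform))
    print("WAH words, clustered:", wah_word_count(clustered))
```

At the same bit density, the clustered bitmap produces long runs that collapse into a few fill words, while the uniform bitmap scatters set bits across many groups and forces literal words. This is one reason an analytical model would treat the two cases separately.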
Pages: 10