New algorithm for computing cube on very large compressed data sets

Cited by: 5
Authors
IEEE [1]
Unknown [2]
Unknown [3]
Affiliations
Source
IEEE Trans Knowl Data Eng | 2006 / Vol. 18, No. 12 / pp. 1667-1680
Funding
U.S. National Science Foundation;
Keywords
Compressed data sets - Cube algorithms - Cube operation;
DOI
10.1109/TKDE.2006.195
CLC number
Subject classification code
Abstract
Data compression is an effective technique for improving the performance of data warehouses. Because the cube operation is the core of online analytical processing (OLAP) in data warehouses, developing efficient algorithms for computing the cube on compressed data warehouses is a major challenge. To our knowledge, very few cube computation techniques for compressed data warehouses have been proposed in the literature to date. This paper presents a novel algorithm for computing cubes on compressed data warehouses. The algorithm operates directly on compressed data sets, without first decompressing them, and applies to a large class of mapping-complete data compression methods. The complexity of the algorithm is analyzed in detail. Analytical and experimental results show that the algorithm is more efficient than all other existing cube algorithms. In addition, a heuristic algorithm for generating an optimal plan for computing the cube is proposed. © 2006 IEEE.
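To make the cube operation concrete, here is a minimal Python sketch. It is not the paper's algorithm and does not implement a mapping-complete compression method. It only illustrates the general idea of aggregating over compressed input without expanding it, under a toy assumption: identical fact rows are stored as a single run `(dims, measure, count)`. The names `cube_sum`, `runs`, and `ALL` are hypothetical, chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations

ALL = "*"  # marks a dimension that has been rolled up (aggregated away)


def cube_sum(runs, n_dims):
    """Compute SUM(measure) for every group-by over n_dims dimensions.

    runs: iterable of (dims, measure, count). Each run stands for `count`
    identical fact rows; this run-length layout is an assumption made for
    the sketch, not the compression scheme analyzed in the paper.
    """
    result = defaultdict(int)
    # Each subset of dimension positions defines one cuboid (2**n_dims in total).
    subsets = [
        kept
        for k in range(n_dims + 1)
        for kept in combinations(range(n_dims), k)
    ]
    for dims, measure, count in runs:
        total = measure * count  # aggregate the whole run without expanding it
        for kept in subsets:
            key = tuple(dims[i] if i in kept else ALL for i in range(n_dims))
            result[key] += total
    return dict(result)


# Toy fact table with dimensions (region, product), stored as runs.
runs = [
    (("east", "tv"), 10, 3),     # three identical rows with measure 10
    (("east", "radio"), 5, 2),   # two identical rows with measure 5
    (("west", "tv"), 7, 1),      # one row with measure 7
]
cube = cube_sum(runs, 2)
```

In this sketch each run contributes `measure * count` to every cuboid it belongs to, so the work depends on the number of runs rather than the number of uncompressed rows. A production cube algorithm would also share work between cuboids, which is the role of the computation plan mentioned in the abstract.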