New algorithm for computing cube on very large compressed data sets

Cited by: 5
Authors
Wu, Weili
Gao, Hong
Li, Jianzhong
Affiliations
Source
IEEE Trans Knowl Data Eng | 2006 / Vol. 18 / No. 12 / pp. 1667-1680
Funding
National Science Foundation (USA)
Keywords
Compressed data sets - Cube algorithms - Cube operation
DOI
10.1109/TKDE.2006.195
Chinese Library Classification number
Subject classification code
Abstract
Data compression is an effective technique for improving the performance of data warehouses. Since the cube operation is at the core of online analytical processing in data warehouses, developing efficient algorithms for computing the cube on compressed data warehouses is a major challenge. To our knowledge, very few cube computation techniques for compressed data warehouses have been proposed in the literature to date. This paper presents a novel algorithm for computing cubes on compressed data warehouses. The algorithm operates directly on compressed data sets without first decompressing them, and it is applicable to a large class of mapping-complete data compression methods. The complexity of the algorithm is analyzed in detail. The analytical and experimental results show that the algorithm is more efficient than all other existing cube algorithms. In addition, a heuristic algorithm for generating an optimal plan for computing the cube is proposed. © 2006 IEEE.
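To make the idea of a cube operation over compressed data concrete, here is a minimal Python sketch. It is not the algorithm from the paper: it assumes a simple run-length encoding, in which each distinct row is stored once with a repeat count, as a stand-in for the mapping-complete compression methods the abstract refers to. The function name `cube_on_rle` and the sample data are hypothetical. The sketch computes a SUM aggregate for every subset of the dimensions, using the repeat count directly so that no run is ever expanded.

```python
from collections import defaultdict
from itertools import combinations


def cube_on_rle(runs, n_dims):
    """Compute SUM aggregates for every group-by subset of the dimensions.

    `runs` is a list of (dims, measure, run_length) triples: an assumed
    run-length encoding where a row with the given dimension values and
    measure occurs `run_length` times in the uncompressed data.
    Returns {group_by_indices: {key_tuple: total}}.
    """
    result = {}
    for k in range(n_dims + 1):
        # Each combination of dimension indices is one cuboid of the cube.
        for group_by in combinations(range(n_dims), k):
            totals = defaultdict(int)
            for dims, measure, run_length in runs:
                key = tuple(dims[i] for i in group_by)
                # Multiply by the run length instead of decompressing the run.
                totals[key] += measure * run_length
            result[group_by] = dict(totals)
    return result


# Hypothetical data: dimensions are (region, product), measure is sales.
runs = [
    (("east", "tv"), 10, 3),     # three identical rows, stored once
    (("east", "radio"), 5, 2),
    (("west", "tv"), 7, 1),
]
cube = cube_on_rle(runs, n_dims=2)
print(cube[()])      # grand total
print(cube[(0,)])    # totals by region
print(cube[(1,)])    # totals by product
```

This naive version rescans the compressed input once per cuboid, for 2^n scans over n dimensions. Reducing that cost, for example by deriving one cuboid from another, is the kind of planning the abstract's heuristic addresses.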
Related papers
50 in total
  • [1] New algorithm for computing cube on very large compressed data sets
    Wu, Weili
    Gao, Hong
    Li, Jianzhong
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2006, 18 (12) : 1667 - 1680
  • [2] Cube algorithms for very large compressed data warehouses
    Gao, H.
    Li, J.
    Ruan Jian Xue Bao/Journal of Software, 2001, 12 (06) : 830 - 839
  • [3] A genetic algorithm for clustering on very large data sets
    Gasvoda, J
    Ding, Q
    COMPUTER APPLICATIONS IN INDUSTRY AND ENGINEERING, 2003, : 163 - 167
  • [4] Fast SVM training algorithm with decomposition on very large data sets
    Dong, JX
    Krzyzak, A
    Suen, CY
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2005, 27 (04) : 603 - 618
  • [5] DESCRY: A density based clustering algorithm for very large data sets
    Angiulli, F
    Pizzuti, C
    Ruffolo, M
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING IDEAL 2004, PROCEEDINGS, 2004, 3177 : 203 - 210
  • [6] Data mining from extreme data sets: Very large and/or very skewed data sets
    Hall, LO
    2001 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-5: E-SYSTEMS AND E-MAN FOR CYBERNETICS IN CYBERSPACE, 2002, : 2555 - 2555
  • [7] Joining very large data sets
    Johnson, T
    Chatziantoniou, D
    DATABASES IN TELECOMMUNICATIONS, 2000, 1819 : 118 - 132
  • [8] PCA and PLS with very large data sets
    Kettaneh, N
    Berglund, A
    Wold, S
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2005, 48 (01) : 69 - 85
  • [9] Clustering Very Large Dissimilarity Data Sets
    Hammer, Barbara
    Hasenfuss, Alexander
    ARTIFICIAL NEURAL NETWORKS IN PATTERN RECOGNITION, PROCEEDINGS, 2010, 5998 : 259 - +
  • [10] Aggregation algorithms for very large compressed data warehouses
    Li, JZ
    Rotem, D
    Srivastava, J
    PROCEEDINGS OF THE TWENTY-FIFTH INTERNATIONAL CONFERENCE ON VERY LARGE DATA BASES, 1999, : 651 - 662