New algorithm for computing cube on very large compressed data sets

Cited by: 5

Authors:

IEEE ^{[1]}

Unknown ^{[2]}

Unknown ^{[3]}

Affiliations:

Source:

IEEE Trans Knowl Data Eng | 2006 / Vol. 18 / Issue 12 / pp. 1667-1680

Funding:

National Science Foundation (USA);

Keywords:

Compressed data sets - Cube algorithms - Cube operation;

DOI:

10.1109/TKDE.2006.195

Chinese Library Classification number:

Subject classification number:

Abstract:

Data compression is an effective technique to improve the performance of data warehouses. Since cube operation represents the core of online analytical processing in data warehouses, it is a major challenge to develop efficient algorithms for computing cube on compressed data warehouses. To our knowledge, very few cube computation techniques have been proposed for compressed data warehouses to date in the literature. This paper presents a novel algorithm to compute cubes on compressed data warehouses. The algorithm operates directly on compressed data sets without the need of first decompressing them. The algorithm is applicable to a large class of mapping complete data compression methods. The complexity of the algorithm is analyzed in detail. The analytical and experimental results show that the algorithm is more efficient than all other existing cube algorithms. In addition, a heuristic algorithm to generate an optimal plan for computing cube is also proposed. © 2006 IEEE.

References

50 items in total

[1] New algorithm for computing cube on very large compressed data sets
Wu, Weili
Gao, Hong
Li, Jianzhong
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2006, 18 (12) : 1667 - 1680
[2] Cube algorithms for very large compressed data warehouses
Gao, H.
Li, J.
Ruan Jian Xue Bao/Journal of Software, 2001, 12 (06) : 830 - 839
[3] A genetic algorithm for clustering on very large data sets
Gasvoda, J
Ding, Q
COMPUTER APPLICATIONS IN INDUSTRY AND ENGINEERING, 2003, : 163 - 167
[4] Fast SVM training algorithm with decomposition on very large data sets
Dong, JX
Krzyzak, A
Suen, CY
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2005, 27 (04) : 603 - 618
[5] DESCRY: A density based clustering algorithm for very large data sets
Angiulli, F
Pizzuti, C
Ruffolo, M
INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING IDEAL 2004, PROCEEDINGS, 2004, 3177 : 203 - 210
[6] Data mining from extreme data sets: Very large and/or very skewed data sets
Hall, LO
2001 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-5: E-SYSTEMS AND E-MAN FOR CYBERNETICS IN CYBERSPACE, 2002, : 2555 - 2555
[7] Joining very large data sets
Johnson, T
Chatziantoniou, D
DATABASES IN TELECOMMUNICATIONS, 2000, 1819 : 118 - 132
[8] PCA and PLS with very large data sets
Kettaneh, N
Berglund, A
Wold, S
COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2005, 48 (01) : 69 - 85
[9] Clustering Very Large Dissimilarity Data Sets
Hammer, Barbara
Hasenfuss, Alexander
ARTIFICIAL NEURAL NETWORKS IN PATTERN RECOGNITION, PROCEEDINGS, 2010, 5998 : 259 - +
[10] Aggregation algorithms for very large compressed data warehouses
Li, JZ
Rotem, D
Srivastava, J
PROCEEDINGS OF THE TWENTY-FIFTH INTERNATIONAL CONFERENCE ON VERY LARGE DATA BASES, 1999, : 651 - 662
