New algorithm for computing cube on very large compressed data sets

Cited by: 5
Authors
IEEE [1]
Unknown [2]
Unknown [3]
Institution
Source
IEEE Trans Knowl Data Eng | 2006, 12: 1667-1680
Funding
National Science Foundation (United States)
Keywords
Compressed data sets - Cube algorithms - Cube operation
DOI
10.1109/TKDE.2006.195
Abstract
Data compression is an effective technique for improving the performance of data warehouses. Since the cube operation is the core of online analytical processing in data warehouses, developing efficient algorithms for computing the cube on compressed data warehouses is a major challenge. To our knowledge, very few cube computation techniques for compressed data warehouses have been proposed in the literature to date. This paper presents a novel algorithm for computing cubes on compressed data warehouses. The algorithm operates directly on compressed data sets without first decompressing them, and it is applicable to a large class of mapping-complete data compression methods. The complexity of the algorithm is analyzed in detail. The analytical and experimental results show that the algorithm is more efficient than all other existing cube algorithms. In addition, a heuristic algorithm that generates an optimal plan for computing the cube is proposed. © 2006 IEEE.
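For readers unfamiliar with the cube operation named in the abstract, the following is a minimal sketch of what it computes: one aggregate per subset of the grouping dimensions. It is not the paper's algorithm. It works on plain, uncompressed rows and rescans them once per subset, and the function name `naive_cube`, the sample rows, and the choice of SUM as the aggregate are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def naive_cube(rows, dims, measure):
    """Return {dimension_subset: {key_tuple: sum_of_measure}}.

    Illustrative only: every subset of `dims` is aggregated by a full
    scan of `rows`, so the cost grows with 2**len(dims) scans.
    """
    cube = {}
    for k in range(len(dims) + 1):
        for subset in combinations(dims, k):
            totals = defaultdict(int)
            for row in rows:
                # Group key is the row's values on the chosen dimensions.
                key = tuple(row[d] for d in subset)
                totals[key] += row[measure]
            cube[subset] = dict(totals)
    return cube


# Hypothetical sample data: two dimensions and one measure.
rows = [
    {"region": "east", "product": "pen", "sales": 3},
    {"region": "east", "product": "ink", "sales": 5},
    {"region": "west", "product": "pen", "sales": 2},
]
cube = naive_cube(rows, ("region", "product"), "sales")
```

With two dimensions the result holds four group-bys: the grand total, one per single dimension, and the full two-dimensional grouping. The repeated scans in this sketch are the kind of cost that dedicated cube algorithms, including those for compressed data, aim to reduce.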
Related papers
50 in total
  • [21] A Bayesian spatiotemporal model for very large data sets
    Harrison, L. M.
    Green, G. G. R.
    NEUROIMAGE, 2010, 50 (03) : 1126 - 1141
  • [22] On the interactive visualization of very large image data sets
    Ekpar, Frank
    Yoneda, Masaaki
    Hase, Hiroyuki
    2007 CIT: 7TH IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY, PROCEEDINGS, 2007, : 627 - 632
  • [23] On-line learning for very large data sets
    Bottou, L
    Le Cun, Y
    APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY, 2005, 21 (02) : 137 - 151
  • [24] Efficient aggregation algorithms on very large compressed data warehouses
    Jianzhong Li
    Yingshu Li
    Jaideep Srivastava
    Journal of Computer Science and Technology, 2000, 15 : 213 - 229
  • [25] Efficient aggregation algorithms on very large compressed data warehouses
    Li, JZ
    Li, YS
    Srivastava, J
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2000, 15 (03) : 213 - 229
  • [27] Efficient parallel spectral clustering algorithm design for large data sets under cloud computing environment
    Jin R.
    Kou C.
    Liu R.
    Li Y.
    Journal of Cloud Computing, 2013, 2 (01)
  • [28] Selective sampling for approximate clustering of very large data sets
    Wang, Liang
    Bezdek, James C.
    Leckie, Christopher
    Kotagiri, Ramamohanarao
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2008, 23 (03) : 313 - 331
  • [29] Fixed rank kriging for very large spatial data sets
    Cressie, Noel
    Johannesson, Gardar
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2008, 70 : 209 - 226
  • [30] A Geometric Approach to Train SVM on Very Large Data Sets
    Zeng, Zhi-Qiang
    Xu, Hua-Rong
    Xie, Yan-Qi
    Gao, Ji
    2008 3RD INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEM AND KNOWLEDGE ENGINEERING, VOLS 1 AND 2, 2008, : 991 - +