Efficient aggregation algorithms on very large compressed data warehouses

Cited by: 2
Authors
Li, JZ [1]
Li, YS
Srivastava, J
Affiliations
[1] Harbin Inst Technol, Dept Comp Sci & Engn, Harbin 150001, Peoples R China
[2] Beijing Inst Technol, Beijing 100876, Peoples R China
[3] Univ Minnesota, Minneapolis, MN 55455 USA
Funding
National Natural Science Foundation of China
Keywords
OLAP; aggregation; data warehouse
DOI
10.1007/BF02948809
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Multidimensional aggregation is a dominant operation on data warehouses for on-line analytical processing (OLAP). Many efficient algorithms to compute multidimensional aggregation on relational database based data warehouses have been developed. However, to our knowledge, there is nothing to date in the literature about aggregation algorithms on multidimensional data warehouses that store datasets in multidimensional arrays rather than in tables. This paper presents a set of multidimensional aggregation algorithms on very large and compressed multidimensional data warehouses. These algorithms operate directly on compressed datasets in multidimensional data warehouses without the need to first decompress them. They are applicable to a variety of data compression methods. The algorithms have different performance behavior as a function of dataset parameters, sizes of outputs and main memory availability. The algorithms are described and analyzed with respect to the I/O and CPU costs. A decision procedure to select the most efficient algorithm, given an aggregation request, is also proposed. The analytical and experimental results show that the algorithms are more efficient than the traditional aggregation algorithms.
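The idea of aggregating directly on compressed data, as described in the abstract, can be illustrated with a small sketch. This is not one of the paper's algorithms: it assumes a two-dimensional array stored in row-major order and compressed with simple run-length encoding, and the names `rle_encode` and `sum_by_row` are hypothetical helpers introduced only for this illustration.

```python
from typing import List, Sequence, Tuple

# A run is (value, repeat_count). Run-length encoding is an assumed
# compression scheme chosen for illustration, not the paper's method.
Run = Tuple[int, int]


def rle_encode(flat: Sequence[int]) -> List[Run]:
    """Compress a flattened (row-major) array into runs of equal values."""
    runs: List[Run] = []
    for v in flat:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def sum_by_row(runs: Sequence[Run], n_rows: int, n_cols: int) -> List[int]:
    """SUM grouped by row, computed on the runs without decompressing."""
    totals = [0] * n_rows
    pos = 0  # logical position in the uncompressed row-major array
    for value, count in runs:
        if value == 0:
            # Zero runs contribute nothing to a sum, so skip them wholesale.
            pos += count
            continue
        remaining = count
        while remaining > 0:
            row = pos // n_cols
            # Number of cells of this run that still fall inside `row`.
            take = min(remaining, (row + 1) * n_cols - pos)
            totals[row] += value * take
            pos += take
            remaining -= take
    return totals


if __name__ == "__main__":
    # A 3 x 4 array, flattened row by row.
    flat = [0, 0, 0, 5,
            5, 0, 0, 0,
            2, 2, 2, 2]
    runs = rle_encode(flat)
    print(runs)
    print(sum_by_row(runs, n_rows=3, n_cols=4))
```

In this sketch the work is proportional to the number of runs plus the number of row boundaries crossed, rather than to the number of cells, which is the kind of saving that operating on compressed data is meant to provide.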
Pages: 213-229
Page count: 17