Efficient aggregation algorithms on very large compressed data warehouses

Cited by: 2
Authors
Li, JZ [1]
Li, YS
Srivastava, J
Affiliations
[1] Harbin Inst Technol, Dept Comp Sci & Engn, Harbin 150001, Peoples R China
[2] Beijing Inst Technol, Beijing 100876, Peoples R China
[3] Univ Minnesota, Minneapolis, MN 55455 USA
Funding
National Natural Science Foundation of China;
Keywords
OLAP; aggregation; data warehouse;
DOI
10.1007/BF02948809
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Multidimensional aggregation is a dominant operation on data warehouses for on-line analytical processing (OLAP). Many efficient algorithms have been developed to compute multidimensional aggregation on relational-database-based data warehouses. However, to our knowledge, there is nothing to date in the literature about aggregation algorithms on multidimensional data warehouses, which store datasets in multidimensional arrays rather than in tables. This paper presents a set of multidimensional aggregation algorithms on very large, compressed multidimensional data warehouses. These algorithms operate directly on compressed datasets without the need to first decompress them, and they are applicable to a variety of data compression methods. The algorithms behave differently depending on dataset parameters, output sizes, and main memory availability. They are described and analyzed with respect to their I/O and CPU costs. A decision procedure that selects the most efficient algorithm for a given aggregation request is also proposed. The analytical and experimental results show that the algorithms are more efficient than traditional aggregation algorithms.
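As a rough illustration of what "operating directly on compressed datasets" can mean, the Python sketch below sums a sparse multidimensional array that is stored only as its non-empty cells. It is not one of the paper's algorithms. The storage layout (row-major linear offsets paired with values) and the names `unravel` and `aggregate_sum` are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple


def unravel(offset: int, shape: Sequence[int]) -> Tuple[int, ...]:
    """Convert a row-major linear offset into per-dimension coordinates."""
    coords = []
    for size in reversed(shape):
        offset, coord = divmod(offset, size)
        coords.append(coord)
    return tuple(reversed(coords))


def aggregate_sum(
    offsets: Sequence[int],
    values: Sequence[float],
    shape: Sequence[int],
    group_dims: Sequence[int],
) -> Dict[Tuple[int, ...], float]:
    """SUM grouped by `group_dims`, reading only the stored (non-empty) cells.

    Assumed layout (hypothetical, for illustration): `offsets[i]` is the
    row-major position of the i-th non-empty cell and `values[i]` its measure.
    The dense array is never materialized.
    """
    totals: Dict[Tuple[int, ...], float] = defaultdict(float)
    for offset, value in zip(offsets, values):
        coords = unravel(offset, shape)
        key = tuple(coords[d] for d in group_dims)
        totals[key] += value
    return dict(totals)


# A 2 x 3 x 2 array with four non-empty cells out of twelve.
shape = (2, 3, 2)
offsets = [0, 5, 6, 11]   # cells (0,0,0), (0,2,1), (1,0,0), (1,2,1)
values = [5.0, 3.0, 2.0, 4.0]

by_first_dim = aggregate_sum(offsets, values, shape, group_dims=[0])
by_last_two = aggregate_sum(offsets, values, shape, group_dims=[1, 2])
```

The work here grows with the number of stored cells rather than with the size of the full array. That is the general motivation for aggregating without decompressing, although the paper's own cost analysis covers I/O and CPU in far more detail.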
Pages: 213-229
Number of pages: 17
Related papers
50 in total
  • [21] Efficient data representation for a very large pharmaceutical data repository
    Ben-Miled, Z
    Zaitsev, A
    Bukhres, O
    Bem, M
    Jones, R
    Oppelt, R
    COMPUTERS AND THEIR APPLICATIONS, 2000, : 140 - 145
  • [22] Efficient maintenance of temporal data warehouses
    de Amo, Sandra
    Halfeld Ferrari Alves, Mirian
    Proceedings of the International Database Engineering and Applications Symposium, IDEAS, 2000, : 188 - 196
  • [23] Efficient Document Analytics on Compressed Data: Method, Challenges, Algorithms, Insights
    Zhang, Feng
    Zhai, Jidong
    Shen, Xipeng
    Mutlu, Onur
    Chen, Wenguang
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2018, 11 (11): : 1522 - 1535
  • [24] Handling Very Large Workloads to Effectively Partition Data Warehouses: New Approach and Experimental Study
    Gacem, Amina
    Boukhalfa, Kamel
    2014 5TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION SYSTEMS (ICICS), 2014,
  • [25] Fuzzy c-Means Algorithms for Very Large Data
    Havens, Timothy C.
    Bezdek, James C.
    Leckie, Christopher
    Hall, Lawrence O.
    Palaniswami, Marimuthu
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2012, 20 (06) : 1130 - 1146
  • [26] Performance optimization of grid aggregation in spatial data warehouses
    Kang, Myoung-Ah
    Zaamoune, Mehdi
    Pinet, Francois
    Bimonte, Sandro
    Beaune, Philippe
    INTERNATIONAL JOURNAL OF DIGITAL EARTH, 2015, 8 (12) : 970 - 988
  • [27] Energy-efficient compressed data aggregation in underwater acoustic sensor networks
    Hongzhi Lin
    Wei Wei
    Ping Zhao
    Xiaoqiang Ma
    Rui Zhang
    Wenping Liu
    Tianping Deng
    Kai Peng
    Wireless Networks, 2016, 22 : 1985 - 1997
  • [28] Energy-efficient compressed data aggregation in underwater acoustic sensor networks
    Lin, Hongzhi
    Wei, Wei
    Zhao, Ping
    Ma, Xiaoqiang
    Zhang, Rui
    Liu, Wenping
    Deng, Tianping
    Peng, Kai
    WIRELESS NETWORKS, 2016, 22 (06) : 1985 - 1997
  • [29] AGGREGATION BY VERY LARGE NUMBERS
    MADDOX, J
    NATURE, 1985, 318 (6043) : 229 - 229
  • [30] Efficient OLAP operations in spatial data warehouses
    Papadias, D
    Kalnis, P
    Zhang, J
    Tao, YF
    ADVANCES IN SPATIAL AND TEMPORAL DATABASES, PROCEEDINGS, 2001, 2121 : 443 - 459