Answering ad hoc aggregate queries from data streams using prefix aggregate trees

Cited by: 0
Authors
Moonjung Cho
Jian Pei
Ke Wang
Affiliations
[1] State University of New York at Buffalo, Department of Computer Science and Engineering
[2] Simon Fraser University, School of Computing Science
[3] 8888 University Drive
Keywords
Data warehousing; Data cube; Data stream; Online analytic processing (OLAP); Aggregate query
DOI
Not available
Abstract
In some business applications, such as trading management in financial institutions, ad hoc aggregate queries over data streams must be answered accurately. Materializing and incrementally maintaining a full data cube, or even a compressed or approximate version of it, over a data stream is often computationally prohibitive. Previous studies have proposed approximate methods for continuous aggregate queries, but these cannot provide accurate answers. In this paper, we develop a novel prefix aggregate tree (PAT) structure for online warehousing of data streams and for answering ad hoc aggregate queries. Often, a data stream can be partitioned into a historical segment, which is stored in a traditional data warehouse, and a transient segment, which can be stored in a PAT to answer ad hoc aggregate queries. The size of a PAT is linear in the size of the transient segment, and only one scan of the data stream is needed to create and incrementally maintain a PAT. Although answering a query with a PAT costs more than with a fully materialized data cube, the query answering time remains linear in the size of the transient segment. Our extensive experimental results on both synthetic and real data sets illustrate the efficiency and scalability of our design.
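The abstract does not give the internal layout of a PAT, so the following is only a minimal sketch of the general idea of keeping aggregates on the prefixes of a tree, not the authors' structure. It assumes a fixed dimension order, COUNT and SUM as the only aggregates, and `None` as a wildcard; the class name `PrefixAggregateSketch`, its methods, and the trading-style dimensions (exchange, symbol, side) are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Sequence, Tuple


@dataclass
class Node:
    """Aggregates over every tuple whose dimension values start with this prefix."""
    count: int = 0
    total: float = 0.0
    children: Dict[str, "Node"] = field(default_factory=dict)


class PrefixAggregateSketch:
    """Hypothetical simplification: a trie with COUNT and SUM stored on each prefix."""

    def __init__(self, num_dims: int) -> None:
        self.num_dims = num_dims
        self.root = Node()

    def insert(self, dims: Sequence[str], measure: float) -> None:
        """Add one stream tuple in a single pass; touches num_dims + 1 nodes."""
        if len(dims) != self.num_dims:
            raise ValueError("tuple has the wrong number of dimensions")
        node = self.root
        node.count += 1
        node.total += measure
        for value in dims:
            node = node.children.setdefault(value, Node())
            node.count += 1
            node.total += measure

    def query(self, pattern: Sequence[Optional[str]]) -> Tuple[int, float]:
        """Return (COUNT, SUM) over tuples matching pattern; None matches any value."""
        if len(pattern) != self.num_dims:
            raise ValueError("pattern has the wrong number of dimensions")
        # Past the last constrained dimension, the stored prefix aggregate is enough.
        last = max((i for i, v in enumerate(pattern) if v is not None), default=-1)
        count, total = 0, 0.0
        stack = [(self.root, 0)]
        while stack:
            node, depth = stack.pop()
            if depth > last:
                count += node.count
                total += node.total
                continue
            wanted = pattern[depth]
            if wanted is None:
                # Wildcard: every child may hold matching tuples.
                stack.extend((child, depth + 1) for child in node.children.values())
            elif wanted in node.children:
                stack.append((node.children[wanted], depth + 1))
        return count, total


if __name__ == "__main__":
    tree = PrefixAggregateSketch(num_dims=3)
    tree.insert(("NYSE", "IBM", "buy"), 100.0)
    tree.insert(("NYSE", "IBM", "sell"), 50.0)
    tree.insert(("NASDAQ", "MSFT", "buy"), 70.0)
    print(tree.query(("NYSE", None, None)))
    print(tree.query((None, None, "buy")))
```

In this sketch each inserted tuple creates at most one node per dimension, so the tree grows linearly with the number of stored tuples, and a query visits each node at most once. That mirrors the linear size and linear query time stated in the abstract, but only for this simplified structure.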
Pages: 301-329
Page count: 28