When Good-Enough is Enough: Complex Queries at Fixed Cost

Cited by: 4
Authors
Mickulicz, Nathan D. [1 ]
Martins, Rolando [1 ]
Narasimhan, Priya [1 ]
Gandhi, Rajeev [1 ]
Affiliations
[1] Carnegie Mellon Univ, Dept Elect & Comp Engn, Pittsburgh, PA 15213 USA
DOI
10.1109/BigDataService.2015.24
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Collections of time-series data appear in a wide variety of contexts. To gain insight into the underlying phenomenon (that the data represents), one must analyze the time-series data. Analysis can quickly become challenging for very large data-sets (on the order of terabytes or more), and it may be infeasible to scan the entire data-set on each query due to time limits or resource constraints. To avoid this problem, one might pre-compute partial results by scanning the data-set (usually as the data arrives). However, for complex queries, where the value of a new data record depends on all of the data previously seen, this might be infeasible because incorporating a large amount of historical data into a query requires a large amount of storage. We present an approach to performing complex queries over very large data-sets in a manner that is (i) practical, meaning that a query does not require a scan of the entire data-set, and (ii) fixed-cost, meaning that the amount of storage required depends only on the time-range spanned by the entire data-set (and not on the size of the data-set itself). We evaluate our approach with three different data-sets: (i) a 4-year commercial analytics data-set from a production content-delivery platform with over 15 million mobile users, (ii) an 18-year data-set from the Linux-kernel commit-history, and (iii) an 8-day data-set from Common Crawl HTTP logs. Our evaluation demonstrates the feasibility and practicality of our approach for a diverse set of complex queries on a diverse set of very large data-sets.
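The abstract does not describe the authors' data structure, so the following is only a minimal sketch of the general idea of pre-computing partial results as data arrives. It assumes a simple count query and fixed-width time buckets; the class name `BucketedCounter`, the bucket width, and the query shape are all hypothetical and not taken from the paper. The point it illustrates is that storage grows with the number of time buckets spanned, not with the number of records ingested.

```python
from collections import defaultdict

BUCKET_SECONDS = 3600  # hypothetical bucket width (one hour), an assumption


class BucketedCounter:
    """Keeps one partial result (a count) per time bucket.

    Illustrative only: storage is proportional to the number of
    distinct time buckets seen, not to the number of records.
    """

    def __init__(self, bucket_seconds=BUCKET_SECONDS):
        self.bucket_seconds = bucket_seconds
        self.counts = defaultdict(int)  # bucket index -> record count

    def ingest(self, timestamp):
        # Update the partial result for this record's bucket on arrival.
        bucket = int(timestamp) // self.bucket_seconds
        self.counts[bucket] += 1

    def query(self, start, end):
        # Answer from the partial results alone; no scan of raw records.
        # Results are at bucket granularity: whole buckets touching
        # [start, end] are included.
        first = int(start) // self.bucket_seconds
        last = int(end) // self.bucket_seconds
        return sum(self.counts.get(b, 0) for b in range(first, last + 1))
```

For example, with `BucketedCounter(bucket_seconds=10)`, ingesting records at timestamps 1, 2, 15, 25, and 25 fills three buckets, and ingesting further records in those same time ranges changes the counts but not the number of stored entries. A count is a deliberately simple case; the complex queries the abstract mentions, where each new record depends on all previously seen data, would need richer per-bucket state than this sketch provides.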
Pages: 89-98
Page count: 10