AFFINITY: Efficiently Querying Statistical Measures on Time-Series Data

Cited by: 0

Authors:

Sathe, Saket [1]

Aberer, Karl [1]

Affiliations:

[1] EPFL, Zurich, Switzerland

Source:

2013 IEEE 29TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE) | 2013

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Computing statistical measures for large databases of time series is a fundamental primitive for querying and mining time-series data [1]-[6]. This primitive is gaining importance with the increasing number and rapid growth of time-series databases. In this paper, we introduce a framework for efficient computation of statistical measures by exploiting the concept of affine relationships. Affine relationships can be used to infer statistical measures for a time series from other related time series instead of computing them directly, thus reducing the overall computational cost significantly. The resulting methods exhibit at least one order of magnitude improvement over the best known methods. To the best of our knowledge, this is the first work that presents a unified approach for computing and querying several statistical measures at once. Our approach exploits affine relationships using three key components. First, the AFCLST algorithm clusters the time-series data such that high-quality affine relationships can be easily found. Second, the SYMEX algorithm uses the clustered time series to efficiently compute the desired affine relationships. Third, the SCAPE index structure produces a many-fold improvement in the performance of processing several statistical queries by seamlessly indexing the affine relationships. Finally, we establish the effectiveness of our approaches through a comprehensive experimental evaluation on real datasets.
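
The core idea in the abstract, inferring a statistical measure of one series from a related series through an affine relationship, can be illustrated with a minimal sketch. It assumes the simplest possible case, a scalar relationship y = a*x + b between two series, and relies only on the standard identities mean(y) = a*mean(x) + b and var(y) = a^2 * var(x). The function name `infer_stats_from_affine` and the sample values are hypothetical; this is not the AFCLST, SYMEX, or SCAPE method, whose details the abstract does not give.

```python
import math
import statistics


def infer_stats_from_affine(mean_x, var_x, a, b):
    """Infer the mean and variance of y = a*x + b from the statistics of x.

    Hypothetical helper for illustration only; it assumes an exact scalar
    affine relationship between the two series.
    """
    return a * mean_x + b, a * a * var_x


# Illustrative data: y is an exact affine transformation of x.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
a, b = 2.5, -3.0
y = [a * v + b for v in x]

# Statistics of x are computed once from the raw data.
mean_x = statistics.fmean(x)
var_x = statistics.pvariance(x)

# Statistics of y are inferred in constant time, without scanning y.
inferred_mean, inferred_var = infer_stats_from_affine(mean_x, var_x, a, b)

# Direct computation over y, used here only to check the inference.
direct_mean = statistics.fmean(y)
direct_var = statistics.pvariance(y)

print(math.isclose(inferred_mean, direct_mean))
print(math.isclose(inferred_var, direct_var))
```

In this toy setting the inferred values agree with the directly computed ones up to floating-point error, while the inference step itself never touches the values of y. With real data an affine relationship would typically hold only approximately, so the inferred measures would be approximations as well.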


Pages: 841-852

Page count: 12
