DDR: an index method for large time-series datasets

Cited by: 14

Authors:

An, JY

Chen, YPP

Chen, HX

Affiliations:

[1] Deakin Univ, Sch Informat Technol, Fac Sci & Technol, Melbourne, Vic 3125, Australia

[2] Australia Res Council Ctr Bioinformat, Melbourne, Vic, Australia

[3] Univ Tsukuba, Inst Informat Sci & Elect, Tsukuba, Ibaraki 305, Japan

Source:

INFORMATION SYSTEMS | 2005 / Vol. 30 / Issue 05

Keywords:

time series; indexing; dimensionality reduction;

DOI:

10.1016/j.is.2004.05.001

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Tree index structures are a traditional method for similarity search in large datasets. They rely on the assumption that most subtrees are pruned during the search, which reduces the number of page accesses. However, time-series datasets generally have very high dimensionality, and because of the so-called curse of dimensionality, pruning becomes less effective as dimensionality grows. Consequently, tree index structures are not well suited to time-series datasets. In this paper, we propose a two-phase (filtering and refinement) method for searching time-series datasets. In the filtering step, quantized time series are used to construct a compact file, which is scanned to filter out irrelevant time series. A small set of candidates is passed to the second step for refinement. In this step, we introduce an effective index compression method named grid-based datawise dimensionality reduction (DDR), which attempts to preserve the characteristics of the time series. An experimental comparison with existing techniques demonstrates the utility of our approach. © 2004 Elsevier Ltd. All rights reserved.
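The two-phase idea in the abstract can be illustrated with a minimal sketch. This is not the authors' DDR method, which the abstract does not specify in detail. It is a generic filter-and-refine nearest-neighbour search that assumes uniform scalar quantization, Euclidean distance, and values confined to a known range [lo, hi]. It also refines against the raw series rather than a compressed index. All function and parameter names (`quantize`, `lower_bound`, `nearest`, `bits`) are hypothetical.

```python
import math
import random

def euclid(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def quantize(series, lo, width, cells):
    """Compact code: index of the uniform grid cell holding each value.
    Assumes every value lies in [lo, lo + cells * width]."""
    return [min(cells - 1, max(0, int((v - lo) / width))) for v in series]

def lower_bound(query, code, lo, width):
    """Distance from the query to the closest point of the code's grid cell.
    This never exceeds the distance to the series stored in that cell."""
    closest = [min(max(q, lo + c * width), lo + (c + 1) * width)
               for q, c in zip(query, code)]
    return euclid(query, closest)

def nearest(query, data, codes, lo, width):
    """Return (index, distance, number_of_exact_distance_computations)."""
    # Phase 1 (filtering): scan only the compact codes.
    bounds = sorted((lower_bound(query, c, lo, width), i)
                    for i, c in enumerate(codes))
    best_i, best_d, refined = -1, math.inf, 0
    # Phase 2 (refinement): visit candidates by increasing lower bound and
    # stop once no remaining candidate can beat the best exact distance.
    for lb, i in bounds:
        if lb >= best_d:
            break
        d = euclid(query, data[i])
        refined += 1
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d, refined

if __name__ == "__main__":
    rng = random.Random(0)
    lo, bits = 0.0, 4                 # assumed value range [0, 1], 4 bits per value
    cells = 1 << bits
    width = 1.0 / cells
    data = [[rng.random() for _ in range(16)] for _ in range(500)]
    codes = [quantize(s, lo, width, cells) for s in data]
    query = [rng.random() for _ in range(16)]
    idx, dist, refined = nearest(query, data, codes, lo, width)
    print(f"nearest={idx} distance={dist:.4f} refined={refined}/{len(data)}")
```

Because the cell-based bound never overestimates the true distance, the early stop in the refinement loop does not discard the true nearest neighbour under the stated assumptions. How many candidates survive filtering depends on the grid resolution and the data distribution.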


Pages: 333-348

Page count: 16
