Density-based multiscale data condensation

Cited by: 86
Authors
Mitra, P [1]
Murthy, CA [1]
Pal, SK [1]
Affiliations
[1] Indian Stat Inst, Machine Intelligence Unit, Kolkata 700035, W Bengal, India
Keywords
data mining; multiscale condensation; scalability; density estimation; convergence in probability; instance learning
DOI
10.1109/TPAMI.2002.1008381
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A problem gaining interest in pattern recognition applied to data mining is that of selecting a small representative subset from a very large data set. In this article, a nonparametric data reduction scheme is suggested. It attempts to represent the density underlying the data. The algorithm selects representative points in a multiscale fashion, which distinguishes it from existing density-based approaches. The accuracy of representation by the condensed set is measured in terms of the error in density estimates of the original and reduced sets. Experimental studies on several real-life data sets show that the multiscale approach is superior to several related condensation methods in terms of both condensation ratio and estimation error. The condensed set was also shown experimentally to be effective for important data mining tasks such as classification, clustering, and rule generation on large data sets. Moreover, the algorithm is found empirically to be efficient in terms of sample complexity.
Pages: 734-747
Page count: 14
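The abstract describes selecting representative points in a multiscale, density-based fashion but does not give the procedure. The sketch below shows one way such a selection could work, and is not necessarily the authors' exact algorithm. It assumes a k-nearest-neighbour radius as the local density proxy and a pruning radius of twice that value; the function names `knn_radius` and `condense` and the parameter `k` are hypothetical, introduced only for illustration.

```python
import math
import random


def knn_radius(points, i, k):
    """Distance from points[i] to its k-th nearest neighbour (a density proxy)."""
    dists = sorted(math.dist(points[i], q) for j, q in enumerate(points) if j != i)
    return dists[k - 1]


def condense(points, k):
    """Greedy selection sketch: repeatedly keep the point in the densest region.

    Assumption (not taken from the abstract): a point's k-NN radius r stands in
    for local density, and every point within 2 * r of a selected point is
    treated as represented by it and dropped.
    """
    remaining = list(points)
    selected = []
    # At least k + 1 points are needed for a k-th neighbour to exist;
    # any final leftover of k or fewer points is simply discarded here.
    while len(remaining) > k:
        radii = [knn_radius(remaining, i, k) for i in range(len(remaining))]
        best = min(range(len(remaining)), key=radii.__getitem__)
        centre, r = remaining[best], radii[best]
        selected.append((centre, r))
        # Small r in dense regions, large r in sparse ones: the "scale" varies.
        remaining = [p for p in remaining if math.dist(p, centre) > 2 * r]
    return selected


# Usage on synthetic data: one tight cluster and one diffuse cluster.
random.seed(0)
dense = [(random.gauss(0, 0.1), random.gauss(0, 0.1)) for _ in range(200)]
sparse = [(random.gauss(5, 1.0), random.gauss(5, 1.0)) for _ in range(50)]
condensed = condense(dense + sparse, k=5)
print(len(condensed), "representatives for", len(dense) + len(sparse), "points")
```

Each selected point carries its own radius, so dense regions are covered by many small discs and sparse regions by a few large ones. This brute-force version recomputes all neighbour distances on every pass and is meant only to illustrate the idea on small inputs.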