Anytime density-based clustering of complex data

Cited by: 0
Authors
Son T. Mai
Xiao He
Jing Feng
Claudia Plant
Christian Böhm
Affiliations
[1] University of Munich, Institute for Informatics
[2] Technische Universität München, Helmholtz Zentrum München
Source
Knowledge and Information Systems | 2015 / Vol. 45
Keywords
Anytime clustering; Density-based clustering; Lower bounding distance; Fiber segmentation; Fiber clustering; Diffusion tensor imaging
DOI
Not available
Abstract
Many clustering algorithms suffer from scalability problems on massive datasets and do not support any user interaction during runtime. To tackle these problems, anytime clustering algorithms have been proposed. They produce a fast approximate result that is continuously refined as the run proceeds, and they can be stopped or suspended at any time to provide an intermediate answer. In this paper, we propose a novel anytime clustering algorithm modeled on the density-based clustering paradigm. Our algorithm, called A-DBSCAN, is applicable to many kinds of complex data such as trajectory and medical data. The general idea of our algorithm is to use a sequence of lower bounding functions (LBs) of the true distance function to produce multiple approximate results of the true density-based clusters. A-DBSCAN operates in multiple levels w.r.t. the LBs and is mainly based on two algorithmic schemes: (1) an efficient distance upgrade scheme, which restricts distance calculations to core objects at each level of the LBs, and (2) a local reclustering scheme, which restricts update operations to the relevant objects only. To further improve performance, we propose a significant extension of A-DBSCAN called A-DBSCAN-XS, which is built upon the anytime scheme of A-DBSCAN and the μ-range query scheme of a data structure called extended Xseedlist. A-DBSCAN-XS requires fewer distance calculations at each level than A-DBSCAN and is thus more efficient. Extensive experiments demonstrate that A-DBSCAN and A-DBSCAN-XS acquire very good clustering results at very early stages of execution and thus save a large amount of computational time. Even if they run to the end, A-DBSCAN and A-DBSCAN-XS are still orders of magnitude faster than the original algorithm DBSCAN and its variants. We also introduce a novel application of our algorithms: the segmentation of the white matter fiber tracts in the human brain, which is an important tool for studying brain structure and various diseases such as Alzheimer's disease.
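The role of a sequence of lower bounding functions can be illustrated with a small sketch. This is not the authors' A-DBSCAN; it only shows, under simple assumptions, how successively tighter lower bounds of a distance give a series of approximate neighborhoods that shrink toward the exact one. The helper names `prefix_lower_bound` and `refine_neighborhood` are hypothetical, and the bound used here (Euclidean distance on the first k coordinates) is chosen for illustration rather than taken from the paper.

```python
import math
from typing import Callable, Iterator, List, Sequence

Point = Sequence[float]
Distance = Callable[[Point, Point], float]


def prefix_lower_bound(k: int) -> Distance:
    """Euclidean distance restricted to the first k coordinates (hypothetical helper).

    Dropping coordinates can only shrink the sum of squares, so the value
    never exceeds the full Euclidean distance: it is a valid lower bound.
    """
    def lb(a: Point, b: Point) -> float:
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a[:k], b[:k])))
    return lb


def refine_neighborhood(
    points: List[Point], query: Point, eps: float, levels: List[Distance]
) -> Iterator[List[int]]:
    """Yield the candidate eps-neighbors of `query` after each level.

    A point is discarded only when some lower bound already exceeds eps,
    so every yielded list is a superset of the true neighborhood.
    """
    candidates = list(range(len(points)))
    for dist in levels:
        candidates = [i for i in candidates if dist(points[i], query) <= eps]
        yield candidates


points = [
    (0.0, 0.0, 0.0),
    (0.5, 0.1, 0.2),
    (0.4, 3.0, 0.0),
    (0.3, 0.2, 5.0),
    (4.0, 0.0, 0.0),
]
# The last level uses all three coordinates, i.e. the true distance.
levels = [prefix_lower_bound(1), prefix_lower_bound(2), prefix_lower_bound(3)]
results = list(refine_neighborhood(points, points[0], 1.0, levels))
for level, candidates in enumerate(results, start=1):
    print(level, candidates)
```

Stopping after any level gives a usable, if over-inclusive, neighborhood, which is the sense in which such a scheme is "anytime". Later levels only evaluate points that survived the cheaper bounds.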
Pages: 319-355
Page count: 36