Geometric algorithms for density-based data clustering

Authors
Chen, DZ
Smid, M
Xu, B [1]
Affiliations
[1] Univ Notre Dame, Dept Comp Sci & Engn, Notre Dame, IN 46556 USA
[2] Carleton Univ, Sch Comp Sci, Ottawa, ON K1S 5B6, Canada
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present new geometric approximation and exact algorithms for the density-based data clustering problem in d-dimensional space R^d (for any constant integer d ≥ 2). Previously known algorithms for this problem are efficient only for uniformly distributed points, and all of them run in Θ(n^2) time in the worst case, where n is the number of input points. Our approximation algorithm, based on the ε-fuzzy distance function, takes O(n log n) time for any given fixed value ε > 0, and our exact algorithms take sub-quadratic time. The running times and output quality of our algorithms do not depend on any particular data distribution. We believe that our fast approximation algorithm is of considerable practical importance, while our sub-quadratic exact algorithms are more of theoretical interest. We implemented our approximation algorithm, and the experimental results show that it is efficient on arbitrary input point sets.
Pages: 284-296
Number of pages: 13