Flexible density peak clustering for real-world data

Cited by: 2
Authors
Hou, Jian [1]
Lin, Houshen [1]
Yuan, Huaqiang [1]
Pelillo, Marcello [2, 3]
Affiliations
[1] Dongguan Univ Technol, Sch Comp Sci & Technol, Dongguan 523808, Peoples R China
[2] Ca Foscari Univ, DAIS, I-30172 Venice, Italy
[3] Ca Foscari Univ, European Ctr Living Technol, I-30123 Venice, Italy
Funding
National Natural Science Foundation of China
Keywords
Clustering; Density peak; Real-world data; Number of clusters; FAST SEARCH; K-MEANS; FIND;
DOI
10.1016/j.patcog.2024.110772
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In density-based clustering, the density peak algorithm has attracted much attention due to its effectiveness and simplicity, and a large number of clustering approaches have been proposed based on this algorithm. Some of these works require manual selection of cluster centers with a decision graph, where human involvement leads to uncertainty in the clustering results. To avoid human involvement, some other algorithms depend on a user-specified number of clusters to determine cluster centers automatically. However, it is well known that accurately estimating the number of clusters is a long-standing difficulty in data clustering. In this paper we present a sequential density peak clustering algorithm that extracts clusters one by one, thereby determining the number of clusters automatically and avoiding manual selection of cluster centers at the same time. Starting from a density peak, our algorithm generates an initial cluster surrounding the density peak in the first step, and then obtains the final cluster by expanding the initial cluster based on the relative density relationship among neighboring data points. With a peeling-off strategy, we obtain all the clusters sequentially. Our algorithm works well with clusters of Gaussian distribution and therefore has potential for clustering real-world data. Experiments with a large number of synthetic and real datasets and comparisons with existing algorithms demonstrate the effectiveness of the proposed algorithm.
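For readers unfamiliar with the decision graph mentioned in the abstract, the following is a minimal sketch of the two quantities it plots in the classic density peak algorithm: a local density (rho) and the distance to the nearest denser point (delta). It is not the sequential algorithm proposed in the paper, whose details the abstract does not give. The function name `decision_graph`, the Gaussian-kernel density, the cutoff `dc=0.5`, and the synthetic two-blob data are all assumptions made for illustration.

```python
import numpy as np

def decision_graph(points, dc):
    """Return (rho, delta) per point, as plotted in a density peak decision graph.

    Hypothetical helper for illustration; dc is an assumed cutoff distance.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances via broadcasting.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Gaussian-kernel local density; subtract 1 to drop each point's self-term.
    rho = np.exp(-(dist / dc) ** 2).sum(axis=1) - 1.0
    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = rho > rho[i]
        # Distance to the nearest denser point; the densest point
        # has no denser neighbor, so it gets its largest distance instead.
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho, delta

# Assumed toy data: two well-separated Gaussian blobs of 50 points each.
rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(0.0, 0.3, (50, 2)),
    rng.normal(5.0, 0.3, (50, 2)),
])
rho, delta = decision_graph(data, dc=0.5)

# Points with both high rho and high delta are candidate cluster centers.
top_two = np.argsort(rho * delta)[-2:]
print(top_two)
```

In the manual approaches the abstract describes, a user would pick the centers by eye from a plot of rho against delta. Ranking by the product `rho * delta` and keeping the top two stands in for the alternative that requires a user-specified number of clusters, which is the dependency the paper aims to remove.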
Pages: 13