Clustering by measuring local direction centrality for data with heterogeneous density and weak connectivity

Cited by: 39
Authors
Peng, Dehua [1, 2, 3, 4]
Gui, Zhipeng [2, 3, 4]
Wang, Dehe [5, 6]
Ma, Yuncheng [2, 3]
Huang, Zichen [2, 3]
Zhou, Yu [5, 6]
Wu, Huayi [1, 3, 4]
Affiliations
[1] Wuhan Univ, State Key Lab Informat Engn Surveying Mapping & R, Wuhan, Peoples R China
[2] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan, Peoples R China
[3] Wuhan Univ, Collaborat Innovat Ctr Geospatial Technol, Wuhan, Peoples R China
[4] Hubei Luojia Lab, Wuhan, Peoples R China
[5] Wuhan Univ, Coll Life Sci, Modern Virol Res Ctr, State Key Lab Virol, Wuhan, Peoples R China
[6] Wuhan Univ, Frontier Sci Ctr Immunol & Metab, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
TRANSCRIPTOMIC CELL-TYPES; RNA-SEQ; FLOW; IDENTIFICATION; SPACE; ALGORITHM; EFFICIENT; CRITERIA; TOOL;
DOI
10.1038/s41467-022-33136-9
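Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes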
07 ; 0710 ; 09 ;
Abstract
Clustering is a powerful machine learning method for discovering similar patterns according to the proximity of elements in feature space. It is widely used in computer science, bioscience, geoscience, and economics. Although state-of-the-art partition-based and connectivity-based clustering methods have been developed, weak connectivity and heterogeneous density in data impede their effectiveness. In this work, we propose a boundary-seeking Clustering algorithm using the local Direction Centrality (CDC). It adopts a density-independent metric based on the distribution of K-nearest neighbors (KNNs) to distinguish between internal and boundary points. The boundary points generate enclosed cages to bind the connections of internal points, thereby preventing cross-cluster connections and separating weakly connected clusters. We demonstrate the validity of CDC by detecting complex structured clusters in challenging synthetic datasets, identifying cell types from single-cell RNA sequencing (scRNA-seq) and mass cytometry (CyTOF) data, recognizing speakers on voice corpora, and testing on various types of real-world benchmarks.
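The abstract does not spell out the metric, so the following is only a minimal sketch of one way a density-independent, KNN-distribution-based score could be computed in 2D. It assumes the score is the variance of the angular gaps between the directions to a point's K nearest neighbors: a point surrounded evenly on all sides scores near zero, while a point whose neighbors all lie on one side scores high. The function name `direction_centrality` and the brute-force neighbor search are illustrative choices, not taken from the paper.

```python
import numpy as np

def direction_centrality(points, k):
    """Variance of the angular gaps between each point's k nearest neighbors (2D only).

    Assumption for illustration: low values suggest internal points,
    high values suggest boundary points.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    # Pairwise Euclidean distances; brute force is fine for a small illustration.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    np.fill_diagonal(dist, np.inf)  # exclude each point from its own neighbor list
    knn = np.argsort(dist, axis=1)[:, :k]
    scores = np.empty(n)
    for i in range(n):
        vec = points[knn[i]] - points[i]
        angles = np.sort(np.arctan2(vec[:, 1], vec[:, 0]))
        # Gaps between consecutive directions, including the wrap-around gap.
        gaps = np.diff(np.append(angles, angles[0] + 2 * np.pi))
        # Evenly surrounded points have every gap near 2*pi/k, so the score is near 0.
        scores[i] = np.mean((gaps - 2 * np.pi / k) ** 2)
    return scores

# A center point surrounded by six points on a regular hexagon.
ring = [(np.cos(a), np.sin(a)) for a in np.arange(6) * np.pi / 3]
demo_points = np.array([(0.0, 0.0)] + ring)
demo_scores = direction_centrality(demo_points, k=6)
```

In this toy layout the center point is evenly surrounded and the ring points are not, so the score depends only on neighbor directions and not on how far apart the points are. How the scores are thresholded into internal and boundary points is left open here.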
Pages: 14
Related papers
24 in total
  • [1] Clustering by measuring local direction centrality for data with heterogeneous density and weak connectivity
    Dehua Peng
    Zhipeng Gui
    Dehe Wang
    Yuncheng Ma
    Zichen Huang
    Yu Zhou
    Huayi Wu
    Nature Communications, 13
  • [2] Clustering Networks' Heterogeneous Data in Defining a Comprehensive Closeness Centrality Index
    Barzinpour, Farnaz
    Ali-Ahmadi, B. Hoda
    Alizadeh, Somayeh
    Naini, S. Golamreza Jalali
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2014, 2014
  • [3] Density peak clustering by local centers and improved connectivity kernel
    Guo, Wenjie
    Chen, Wei
    Liu, Xinggao
    INFORMATION SCIENCES, 2024, 666
  • [4] Local Connectivity-Based Density Estimation for Face Clustering
    Shin, Junho
    Lee, Hyo-Jun
    Kim, Hyunseop
    Baek, Jong-Hyeon
    Kim, Daehyun
    Koh, Yeong Jun
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 13621 - 13629
  • [5] Measuring Similarity of Complex and Heterogeneous Data in Clustering of Large Data Sets
    Bacelar-Nicolau, Helena
    Nicolau, Fernando
    Sousa, Aurga
    Bacelar-Nicolau, Leonor
    BIOCYBERNETICS AND BIOMEDICAL ENGINEERING, 2009, 29 (02) : 9 - 18
  • [6] Adaptive Local Data Density for Clustering Analysis
    Liao, Yalu
    Wang, Yaru
    Yue, Shihong
    2018 13TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA), 2018, : 1629 - 1634
  • [7] Efficient Data Clustering by Local Density Approximation
    Akodjenou, Marc-Ismael
    Gallinari, Patrick
    ECAI 2008, PROCEEDINGS, 2008, 178 : 767 - 768
  • [8] ConDPC: Data Connectivity-Based Density Peak Clustering
    Zou, Yujuan
    Wang, Zhijian
    APPLIED SCIENCES-BASEL, 2022, 12 (24):
  • [9] Clustering Algorithm with Local Direction Centrality Measurement for Agricultural Machinery Trajectory Field-Road Classification
    Luo, Tianchangxiao
    Zhai, Weixin
    Computer Engineering and Applications, 2024, 60 (23) : 303 - 313
  • [10] An optimized denoising method for ICESat-2 photon-counting data considering heterogeneous density and weak connectivity
    Huang, Guoan
    Dong, Zhipeng
    Liu, Yanxiong
    Chen, Yilan
    Li, Jie
    Wang, Yanhong
    Meng, Wenjun
    OPTICS EXPRESS, 2023, 31 (25) : 41496 - 41517