SCHNEL: scalable clustering of high dimensional single-cell data

Cited by: 3
Authors
Abdelaal, Tamim [1, 2]
de Raadt, Paul [2]
Lelieveldt, Boudewijn P. F. [1, 2]
Reinders, Marcel J. T. [1, 2, 3]
Mahfouz, Ahmed [1, 2, 3]
Affiliations
[1] Delft Univ Technol, Delft Bioinformat Lab, NL-2628 XE Delft, Netherlands
[2] Leiden Univ, Med Ctr, Leiden Computat Biol Ctr, NL-2333 ZC Leiden, Netherlands
[3] Leiden Univ, Med Ctr, Dept Human Genet, NL-2333 ZC Leiden, Netherlands
Funding
EU Horizon 2020;
Keywords
FLOW-CYTOMETRY; MASS CYTOMETRY; POPULATIONS;
DOI
10.1093/bioinformatics/btaa816
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Motivation: Single-cell data measure multiple cellular markers at the single-cell level for thousands to millions of cells. Identifying distinct cell populations is a key step toward further biological understanding and is usually performed by clustering these data. Dimensionality-reduction-based clustering tools either do not scale to large datasets containing millions of cells or are not fully automated, requiring an initial manual estimate of the number of clusters. Graph clustering tools provide automated and reliable clustering for single-cell data but scale poorly to large datasets.
Results: We developed SCHNEL, a scalable, reliable and automated clustering tool for high-dimensional single-cell data. SCHNEL transforms large high-dimensional data into a hierarchy of datasets containing subsets of data points that follow the original data manifold. SCHNEL's novel approach combines this hierarchical representation of the data with graph clustering, making graph clustering scalable to millions of cells. On seven different cytometry datasets, SCHNEL outperformed three popular clustering tools for cytometry data and produced meaningful clustering results for datasets of 3.5 and 17.2 million cells within workable time frames. In addition, we show that SCHNEL is a general clustering tool by applying it to single-cell RNA sequencing data as well as to MNIST, a popular machine learning benchmark dataset.
Pages: I849 - I856
Number of pages: 8