An active learning approach for clustering single-cell RNA-seq data

Cited by: 6
Authors
Lin, Xiang [1]
Liu, Haoran [1]
Wei, Zhi [1]
Roy, Senjuti Basu [1]
Gao, Nan [2]
Affiliations
[1] New Jersey Inst Technol, Dept Comp Sci, Newark, NJ 07102 USA
[2] Rutgers State Univ, Dept Biol Sci, Newark, NJ USA
DOI
10.1038/s41374-021-00639-w
Chinese Library Classification
R-3 [Medical research methods]; R3 [Basic medicine]
Discipline classification code
1001
Abstract
Single-cell RNA sequencing (scRNA-seq) data have been widely used to profile cellular heterogeneity at high resolution. Clustering is a crucial step in scRNA-seq data analysis because it offers a chance to identify previously undiscovered cell types. Most methods for clustering scRNA-seq data use an unsupervised learning strategy. Because the clustering step is separated from the cell annotation and labeling step, it is not uncommon for the result to be an unfamiliar clustering with poor biological interpretability, an outcome biologists generally do not want. To solve this problem, we proposed an active learning (AL) framework for clustering scRNA-seq data. The AL model employs a learning algorithm that can actively query biologists for labels, and this manual labeling is expected to be applied to only a subset of cells. To develop an optimal active learning approach, we explored several key parameters of the AL model in experiments with four real scRNA-seq datasets. We demonstrate that the proposed AL model outperformed state-of-the-art unsupervised clustering methods with fewer than 1000 labeled cells. We therefore conclude that the AL model is a promising tool for clustering scRNA-seq data that achieves superior performance effectively and efficiently.

The active learning (AL) model is a framework designed for single-cell RNA sequencing (scRNA-seq) clustering. It requires researchers to label a small number of cells chosen by a sample selection algorithm. The labeled cells are then used to supervise the clustering, which significantly boosts clustering performance on scRNA-seq data.
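The query-and-label loop described in the abstract can be illustrated with a minimal pool-based active learning sketch. This is not the authors' model: the abstract does not name the sample selection algorithm or the classifier, so the sketch assumes entropy-based uncertainty sampling and a nearest-centroid classifier, runs on synthetic data rather than real scRNA-seq counts, and replaces the biologist with a hypothetical `oracle` callable. All function names and parameter values are illustrative.

```python
import numpy as np

def simulate_cells(n_per_type=100, n_genes=20, n_types=3, seed=0):
    """Synthetic stand-in for a normalized expression matrix (cells x genes)."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(0.0, 3.0, size=(n_types, n_genes))
    X = np.vstack([c + rng.normal(0.0, 1.0, size=(n_per_type, n_genes))
                   for c in centers])
    y = np.repeat(np.arange(n_types), n_per_type)
    return X, y

def predict_proba(X, X_lab, y_lab):
    """Nearest-centroid classifier (an assumption): softmax over negative
    squared distances to the centroid of each labeled cell type."""
    classes = np.unique(y_lab)
    centroids = np.stack([X_lab[y_lab == c].mean(axis=0) for c in classes])
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2 / X.shape[1]
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return classes, p / p.sum(axis=1, keepdims=True)

def active_learning_clusters(X, oracle, n_initial=30, batch_size=10,
                             n_rounds=5, seed=0):
    """Label a random seed set, then repeatedly query the oracle for the
    cells whose predicted cell type is most uncertain (highest entropy)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=n_initial, replace=False)
    lab = np.asarray(oracle(idx))
    for _ in range(n_rounds):
        _, proba = predict_proba(X, X[idx], lab)
        entropy = -(proba * np.log(proba + 1e-12)).sum(axis=1)
        entropy[idx] = -np.inf  # never re-query an already labeled cell
        query = np.argsort(entropy)[-batch_size:]
        idx = np.concatenate([idx, query])
        lab = np.concatenate([lab, np.asarray(oracle(query))])
    # Final assignment of every cell, supervised by the labeled subset.
    classes, proba = predict_proba(X, X[idx], lab)
    assignments = classes[proba.argmax(axis=1)]
    assignments[idx] = lab  # labeled cells keep the oracle's label
    return assignments, idx

X, y = simulate_cells()
# The oracle stands in for the biologist; here it reads the simulated truth.
assignments, queried = active_learning_clusters(X, oracle=lambda i: y[i])
accuracy = float((assignments == y).mean())
print(f"labeled {len(queried)} of {len(X)} cells, accuracy {accuracy:.3f}")
```

Only the queried cells are ever shown to the oracle, which mirrors the abstract's point that manual labeling applies to a subset of cells. On real data the selection rule, the classifier, and the labeling budget would all need tuning, as the abstract notes for the AL model's key parameters.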
Pages: 227-235
Number of pages: 9
Related papers
50 in total
  • [1] Deep Learning for Clustering Single-cell RNA-seq Data
    Zhu, Yuan
    Bai, Litai
    Ning, Zilin
    Fu, Wenfei
    Liu, Jie
    Jiang, Linfeng
    Fei, Shihuang
    Gong, Shiyun
    Lu, Lulu
    Deng, Minghua
    Yi, Ming
    CURRENT BIOINFORMATICS, 2024, 19 (03) : 193 - 210
  • [2] A Global Similarity Learning for Clustering of Single-Cell RNA-Seq Data
    Zhu, Xiaoshu
    Guo, Lilu
    Xu, Yunpei
    Li, Hong-Dong
    Liao, Xingyu
    Wu, Fang-Xiang
    Peng, Xiaoqing
    2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019, : 261 - 266
  • [3] Clustering single-cell RNA-seq data with a model-based deep learning approach
    Tian, Tian
    Wan, Ji
    Song, Qi
    Wei, Zhi
    NATURE MACHINE INTELLIGENCE, 2019, 1 (04) : 191 - 198
  • [4] Clustering single-cell RNA-seq data with a model-based deep learning approach
    Tian Tian
    Ji Wan
    Qi Song
    Zhi Wei
    Nature Machine Intelligence, 2019, 1 : 191 - 198
  • [5] Clustering single-cell RNA-seq data by rank constrained similarity learning
    Mei, Qinglin
    Li, Guojun
    Su, Zhengchang
    BIOINFORMATICS, 2021, 37 (19) : 3235 - 3242
  • [6] Analysis of Single-Cell RNA-seq Data by Clustering Approaches
    Zhu, Xiaoshu
    Li, Hong-Dong
    Guo, Lilu
    Wu, Fang-Xiang
    Wang, Jianxin
    CURRENT BIOINFORMATICS, 2019, 14 (04) : 314 - 322
  • [7] Challenges in unsupervised clustering of single-cell RNA-seq data
    Kiselev, Vladimir Yu
    Andrews, Tallulah S.
    Hemberg, Martin
    NATURE REVIEWS GENETICS, 2019, 20 (05) : 273 - 282
  • [8] Deep single-cell RNA-seq data clustering with graph prototypical contrastive learning
    Lee, Junseok
    Kim, Sungwon
    Hyun, Dongmin
    Lee, Namkyeong
    Kim, Yejin
    Park, Chanyoung
    BIOINFORMATICS, 2023, 39 (06)
  • [9] A deep matrix factorization based approach for single-cell RNA-seq data clustering
    Liang, Zhenlan
    Zheng, Ruiqing
    Chen, Siqi
    Yan, Xuhua
    Li, Min
    METHODS, 2022, 205 : 114 - 122
  • [10] Single-cell RNA-seq data clustering by deep information fusion
    Ren, Liangrui
    Wang, Jun
    Li, Wei
    Guo, Maozu
    Yu, Guoxian
    BRIEFINGS IN FUNCTIONAL GENOMICS, 2024, 23 (02) : 128 - 137