CASSL: A cell-type annotation method for single cell transcriptomics data using semi-supervised learning

Cited by: 7
Authors
Seal, Dibyendu Bikash [1]
Das, Vivek [2]
De, Rajat K. [3]
Affiliations
[1] Univ Calcutta, AK Choudhury Sch Informat Technol, JD 2,Sect 3, Kolkata 700106, India
[2] Novo Nordisk AS, Novo Nordisk Pk 1, DK-2760 Malov, Denmark
[3] Indian Stat Inst, Machine Intelligence Unit, 203 Barrackpore Trunk Rd, Kolkata 700108, India
Keywords
scRNA-seq; Semi-supervised learning; NMF; k-means; RNA-SEQ DATA; DIMENSIONALITY REDUCTION; IDENTIFICATION; CLASSIFICATION; IMPUTATION; DYNAMICS;
DOI
10.1007/s10489-022-03440-4
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Single-cell RNA sequencing (scRNA-seq) allows global transcriptomic profiling at cellular resolution, thereby enabling the identification of underlying cell types and their corresponding lineages. Such cell-type identification and annotation rely heavily on models that are trained on a large number of individual cells with accurate, annotated labels. At present, cell-type annotation is done by inspecting marker genes from each statistically significant group of cells, which is both challenging and time-consuming. In this article, we propose a semi-supervised cell-type annotation method, called CASSL, based on non-negative matrix factorization (NMF) coupled with a recursive k-means algorithm. A semi-supervised model can learn labels for a large amount of unlabelled data with the help of a limited amount of labelled data. The effectiveness of CASSL has been demonstrated on eight publicly available human and mouse scRNA-seq datasets spanning varied organs and protocols. It correctly annotates the majority of the unlabelled cells with high accuracy. It has also been evaluated for the correctness of its clustering solution, its robustness across varying percentages of missing labels, and its execution time. When compared with state-of-the-art unsupervised and semi-supervised cell-type annotation methods, CASSL consistently outperformed the others across all metrics on most of the datasets. It also showed competitive results against state-of-the-art supervised methods.
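The abstract does not spell out how the recursive k-means step uses the labelled cells, so the following is only an illustrative sketch of the general idea, not the CASSL implementation. It assumes each cell has already been reduced to a low-dimensional embedding (for example by NMF, which is omitted here), and that a cluster is split in two repeatedly until it contains at most one distinct known label, which is then copied to its unlabelled members. The names `kmeans` and `annotate`, the `max_depth` parameter, and the toy data are all hypothetical.

```python
from collections import Counter
from math import dist


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from every centroid chosen so far.
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    assign = None
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        if new_assign == assign:  # assignments stopped changing: converged
            break
        assign = new_assign
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old centroid if a cluster became empty
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign


def annotate(points, labels, max_depth=8):
    """Fill in None entries of `labels` by recursively bisecting the cells.

    Assumption (not from the paper): a cluster stops splitting once it holds
    at most one distinct known label, or when `max_depth` is reached.
    """
    result = list(labels)

    def fill(idx, label):
        for i in idx:
            if result[i] is None:
                result[i] = label

    def recurse(idx, depth, fallback):
        known = Counter(result[i] for i in idx if result[i] is not None)
        majority = known.most_common(1)[0][0] if known else fallback
        if len(known) <= 1 or depth == max_depth or len(idx) < 2:
            fill(idx, majority)
            return
        assign = kmeans([points[i] for i in idx], 2)
        parts = [[i for i, a in zip(idx, assign) if a == c] for c in (0, 1)]
        if not parts[0] or not parts[1]:  # could not split (e.g. identical points)
            fill(idx, majority)
            return
        for part in parts:
            recurse(part, depth + 1, majority)

    recurse(list(range(len(points))), 0, None)
    return result


# Toy embedding: three well-separated groups, one labelled cell in each.
cells = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
known = ["T", None, None, "B", None, None, "NK", None, None]
print(annotate(cells, known))
```

In this sketch, a cluster with no labelled cells inherits the majority label of its parent cluster; this is one simple choice among several, and a real pipeline would also need to handle noisy labels and select the embedding rank.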
Pages: 1287-1305
Number of pages: 19
Related papers
  • [1] CASSL: A cell-type annotation method for single cell transcriptomics data using semi-supervised learning
    Dibyendu Bikash Seal
    Vivek Das
    Rajat K. De
    Applied Intelligence, 2023, 53 : 1287 - 1305
  • [2] CALLR: a semi-supervised cell-type annotation method for single-cell RNA sequencing data
    Wei, Ziyang
    Zhang, Shuqin
    BIOINFORMATICS, 2021, 37 : I51 - I58
  • [3] SpaDecon: cell-type deconvolution in spatial transcriptomics with semi-supervised learning
    Kyle Coleman
    Jian Hu
    Amelia Schroeder
    Edward B. Lee
    Mingyao Li
    Communications Biology, 6
  • [4] SpaDecon: cell-type deconvolution in spatial transcriptomics with semi-supervised learning
    Coleman, Kyle
    Hu, Jian
    Schroeder, Amelia
    Lee, Edward B.
    Li, Mingyao
    COMMUNICATIONS BIOLOGY, 2023, 6 (01)
  • [5] A Novel Workflow for Semi-supervised Annotation of Cell-type Clusters in Mass Cytometry Data
    Kaushik, Abhinav
    Dunham, Diane
    Manohar, Monali
    Nadeau, Kari C.
    Andorf, Sandra
    ACM-BCB'19: PROCEEDINGS OF THE 10TH ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND HEALTH INFORMATICS, 2019, : 532 - 532
  • [6] Semi-supervised integration of single-cell transcriptomics data
    Andreatta, Massimo
    Herault, Leonard
    Gueguen, Paul
    Gfeller, David
    Berenstein, Ariel J.
    Carmona, Santiago J.
    NATURE COMMUNICATIONS, 2024, 15 (01)
  • [7] Semi-supervised integration of single-cell transcriptomics data
    Massimo Andreatta
    Léonard Hérault
    Paul Gueguen
    David Gfeller
    Ariel J. Berenstein
    Santiago J. Carmona
    Nature Communications, 15
  • [8] scSwin: a supervised cell-type annotation method for single-cell RNA sequencing data using Swin Transformer
    Zhang, Shichen
    Xiang, Yiwen
    PROCEEDINGS OF 2024 4TH INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND INTELLIGENT COMPUTING, BIC 2024, 2024, : 479 - 484
  • [9] scSemiAE: a deep model with semi-supervised learning for single-cell transcriptomics
    Dong, Jiayi
    Zhang, Yin
    Wang, Fei
    BMC BIOINFORMATICS, 2022, 23 (01)
  • [10] scSemiAE: a deep model with semi-supervised learning for single-cell transcriptomics
    Jiayi Dong
    Yin Zhang
    Fei Wang
    BMC Bioinformatics, 23