CASSL: A cell-type annotation method for single cell transcriptomics data using semi-supervised learning

Cited by: 7
Authors
Seal, Dibyendu Bikash [1]
Das, Vivek [2]
De, Rajat K. [3]
Affiliations
[1] Univ Calcutta, AK Choudhury Sch Informat Technol, JD 2,Sect 3, Kolkata 700106, India
[2] Novo Nordisk AS, Novo Nordisk Pk 1, DK-2760 Malov, Denmark
[3] Indian Stat Inst, Machine Intelligence Unit, 203 Barrackpore Trunk Rd, Kolkata 700108, India
Keywords
scRNA-seq; Semi-supervised learning; NMF; k-means; RNA-seq data; Dimensionality reduction; Identification; Classification; Imputation; Dynamics
DOI
10.1007/s10489-022-03440-4
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Single cell RNA sequencing (scRNA-seq) allows global transcriptomic profiling at cellular resolution, thereby enabling the identification of underlying cell types and their corresponding lineages. Such cell-type identification and annotation rely heavily on models trained on a large number of individual cells with accurate, annotated labels. At present, cell-type annotation is done by inspecting marker genes from each statistically significant group of cells, which is both challenging and time consuming. In this article, we propose a semi-supervised cell-type annotation method, called CASSL, based on non-negative matrix factorization (NMF) coupled with a recursive k-means algorithm. A semi-supervised model can learn labels for a large amount of unlabelled data with the help of a limited amount of labelled data. The effectiveness of CASSL has been demonstrated on eight publicly available human and mouse scRNA-seq datasets across varied organs and protocols. It correctly annotates the majority of the unlabelled cells with high accuracy. It has also been evaluated for the correctness of its clustering solution, its robustness across varying percentages of missing labels, and its execution time. Compared with state-of-the-art unsupervised and semi-supervised cell-type annotation methods, CASSL consistently outperforms the others across all metrics for most of the datasets. It also shows competitive results against state-of-the-art supervised methods.
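
The abstract names the two building blocks (NMF followed by recursive k-means) but gives no algorithmic detail. The following is a minimal sketch of how such a pipeline could look, not the authors' implementation. Everything specific here is an assumption made for illustration: the function names (`nmf`, `kmeans`, `annotate`), the multiplicative-update NMF, the rule that a cluster is re-split whenever its known labels disagree, the majority-vote fill, and the synthetic two-cell-type count matrix.

```python
import numpy as np

def nmf(X, rank, n_iter=200, seed=0):
    """Multiplicative-update NMF: X (cells x genes) ~ W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank)) + 1e-3
    H = rng.random((rank, X.shape[1])) + 1e-3
    eps = 1e-9  # avoids division by zero
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def kmeans(Z, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per row of Z."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = Z[assign == j].mean(axis=0)
    return assign

def annotate(Z, labels, max_depth=6, seed=0):
    """Fill labels == -1 by recursively splitting clusters with mixed known labels.

    Hypothetical rule (an assumption, not taken from the paper): a cluster whose
    known labels all agree passes that label to its unlabelled cells; otherwise
    it is split again with k-means, up to max_depth levels.
    """
    out = labels.copy()

    def recurse(idx, depth, fallback):
        known = labels[idx][labels[idx] >= 0]
        if known.size == 0:
            out[idx] = fallback  # no labelled cell here: inherit parent majority
            return
        classes, counts = np.unique(known, return_counts=True)
        majority = classes[counts.argmax()]
        if len(classes) == 1 or depth >= max_depth or len(idx) <= len(classes):
            out[idx[labels[idx] < 0]] = majority
            return
        assign = kmeans(Z[idx], k=len(classes), seed=seed + depth)
        for j in range(len(classes)):
            sub = idx[assign == j]
            if len(sub) == len(idx):
                recurse(sub, max_depth, majority)  # split failed: stop here
            elif len(sub) > 0:
                recurse(sub, depth + 1, majority)

    recurse(np.arange(len(labels)), 0, -1)
    return out

# Synthetic demo: 60 cells, 20 genes, two cell types with distinct gene blocks.
rng = np.random.default_rng(42)
truth = np.repeat([0, 1], 30)
block = (np.arange(20) >= 10).astype(int)
means = np.where(truth[:, None] == block[None, :], 10.0, 1.0)
X = rng.poisson(means).astype(float)

# Keep only 6 known labels per type; mark the rest as unlabelled (-1).
partial = np.full(60, -1)
known = np.concatenate([rng.choice(30, 6, replace=False),
                        30 + rng.choice(30, 6, replace=False)])
partial[known] = truth[known]
known_mask = partial >= 0

W, _ = nmf(X, rank=2)        # low-dimensional cell embedding
pred = annotate(W, partial)  # labels propagated to unlabelled cells
```

In this sketch the NMF factor `W` serves as the cell embedding, and the labelled cells act only as a stopping criterion and a voting pool for the clustering. A real analysis would also need normalisation, rank selection, and handling of cell types absent from the labelled set, none of which are covered here.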
Pages: 1287-1305
Page count: 19