Generalized Cell Type Annotation and Discovery for Single-Cell RNA-Seq Data

Authors
Zhai, Yuyao [1]
Chen, Liang [4]
Deng, Minghua [1, 2, 3]
Affiliations
[1] Peking Univ, Sch Math Sci, Beijing, Peoples R China
[2] Peking Univ, Ctr Stat Sci, Beijing, Peoples R China
[3] Peking Univ, Ctr Quantitat Biol, Beijing, Peoples R China
[4] Huawei Technol Co Ltd, Shenzhen, Peoples R China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The rapid development of single-cell RNA sequencing (scRNA-seq) technology allows us to study gene expression heterogeneity at the cellular level. Cell annotation is the basis for downstream analysis in single-cell data mining. Existing methods rarely explore the fine-grained semantic knowledge of novel cell types absent from the reference data, and they are usually susceptible to batch effects when classifying seen cell types. Taking these limitations into consideration, this paper proposes a new and practical task called generalized cell type annotation and discovery for scRNA-seq data. In this task, cells of seen cell types are given class labels, while cells of novel cell types are given cluster labels instead of a unified "unassigned" label. To address this problem, we carefully design a comprehensive evaluation benchmark and propose a novel end-to-end algorithm framework called scGAD. Specifically, scGAD first builds the intrinsic correspondence across the reference and target data by retrieving the geometrically and semantically mutual nearest neighbors as anchor pairs. Then we introduce an anchor-based self-supervised learning module with a connectivity-aware attention mechanism to improve the model's prediction capability on unlabeled target data. To enhance the inter-type separation and intra-type compactness, we further propose a confidential prototypical self-supervised learning module to uncover the consensus category structure of the reference and target data. Extensive results on massive real datasets demonstrate the superiority of scGAD over various state-of-the-art clustering and annotation methods.
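The anchor-pair idea mentioned in the abstract can be illustrated with a minimal sketch of mutual nearest neighbor (MNN) retrieval. This is not the scGAD implementation: it covers only the geometric part (Euclidean distance in some embedding space) and leaves out the semantic criterion, and the function name `mutual_nearest_neighbors`, the parameter `k`, and the toy arrays are assumptions made for illustration.

```python
import numpy as np


def mutual_nearest_neighbors(ref, tgt, k=1):
    """Return (ref_index, tgt_index) pairs that appear in each other's k-NN lists.

    ref: array of shape (n_ref, n_features), hypothetical reference-cell embeddings.
    tgt: array of shape (n_tgt, n_features), hypothetical target-cell embeddings.
    """
    # Pairwise squared Euclidean distances, shape (n_ref, n_tgt).
    dist = ((ref[:, None, :] - tgt[None, :, :]) ** 2).sum(axis=-1)
    # For each reference cell, the indices of its k nearest target cells.
    ref_to_tgt = np.argsort(dist, axis=1)[:, :k]
    # For each target cell, the indices of its k nearest reference cells.
    tgt_to_ref = np.argsort(dist, axis=0)[:k, :].T
    pairs = []
    for i in range(ref.shape[0]):
        for j in ref_to_tgt[i]:
            # Keep the pair only if the neighbor relation holds in both directions.
            if i in tgt_to_ref[j]:
                pairs.append((i, int(j)))
    return pairs


# Toy data (assumed): two reference cells and three target cells in 2-D.
ref = np.array([[0.0, 0.0], [10.0, 10.0]])
tgt = np.array([[0.1, 0.0], [10.0, 10.1], [4.0, 4.0]])
pairs = mutual_nearest_neighbors(ref, tgt, k=1)
print(pairs)
```

In this toy case the third target cell has a nearest reference cell, but no reference cell selects it in return, so it yields no anchor pair. Under the task described in the abstract, such unanchored target cells are the ones that might belong to novel cell types.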
Pages: 5402-5410
Number of pages: 9