Generalized Cell Type Annotation and Discovery for Single-Cell RNA-Seq Data
Cited by: 0
Authors:
Zhai, Yuyao [1]
Chen, Liang [4]
Deng, Minghua [1,2,3]
Affiliations:
[1] Peking Univ, Sch Math Sci, Beijing, Peoples R China
[2] Peking Univ, Ctr Stat Sci, Beijing, Peoples R China
[3] Peking Univ, Ctr Quantitat Biol, Beijing, Peoples R China
[4] Huawei Technol Co Ltd, Shenzhen, Peoples R China
The rapid development of single-cell RNA sequencing (scRNA-seq) technology allows us to study gene expression heterogeneity at the cellular level. Cell annotation is the basis for downstream analysis in single-cell data mining. Existing methods rarely explore the fine-grained semantic knowledge of novel cell types that are absent from the reference data, and they are usually susceptible to batch effects when classifying seen cell types. To address these limitations, this paper proposes a new and practical task called generalized cell type annotation and discovery for scRNA-seq data. In this task, cells of seen cell types are given class labels, while cells of novel cell types are given cluster labels instead of a single unified "unassigned" label. To tackle this task, we design a comprehensive evaluation benchmark and propose a novel end-to-end algorithmic framework called scGAD. Specifically, scGAD first builds the intrinsic correspondence between the reference and target data by retrieving geometrically and semantically mutual nearest neighbors as anchor pairs. We then introduce an anchor-based self-supervised learning module with a connectivity-aware attention mechanism to improve the model's predictions on unlabeled target data. To enhance inter-type separation and intra-type compactness, we further propose a confidential prototypical self-supervised learning module that uncovers the consensus category structure of the reference and target data. Extensive experiments on a large number of real datasets demonstrate the superiority of scGAD over various state-of-the-art clustering and annotation methods.