Multiresolution categorical regression for interpretable cell-type annotation

Authors
Molstad, Aaron J. [1]
Motwani, Keshav [2]
Affiliations
[1] Univ Minnesota, Sch Stat, Minneapolis, MN 55455 USA
[2] Univ Washington, Dept Biostat, Seattle, WA USA
Funding
National Science Foundation (USA);
Keywords
categorical response regression; cell-type annotation; convex optimization; multinomial logistic regression; multiresolution learning; single-cell RNA-seq;
DOI
10.1111/biom.13926
Chinese Library Classification
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
In many categorical response regression applications, the response categories admit a multiresolution structure. That is, subsets of the response categories may naturally be combined into coarser response categories. In such applications, practitioners are often interested in estimating the resolution at which a predictor affects the response category probabilities. In this paper, we propose a method for fitting the multinomial logistic regression model in high dimensions that addresses this problem in a unified and data-driven way. Our method allows practitioners to identify which predictors distinguish between coarse categories but not fine categories, which predictors distinguish between fine categories, and which predictors are irrelevant. For model fitting, we propose a scalable algorithm that can be applied when the coarse categories are defined by either overlapping or nonoverlapping sets of fine categories. Statistical properties of our method reveal that it can take advantage of this multiresolution structure in a way existing estimators cannot. We use our method to model cell-type probabilities as a function of a cell's gene expression profile (i.e., cell-type annotation). Our fitted model provides novel biological insights which may be useful for future automated and manual cell-type annotation methodology.
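The idea of predictors acting at a coarse or a fine resolution can be illustrated with a small sketch. The penalty below is an assumption made for illustration only (a group-lasso term on within-coarse-category deviations plus a row-wise group-lasso term); it is not taken from the abstract and should not be read as the estimator defined in the paper. The function names, the tuning parameters `lam` and `gamma`, and the toy coefficient matrix are all hypothetical.

```python
import numpy as np


def multiresolution_penalty(B, coarse_groups, lam, gamma):
    """Illustrative penalty (an assumption, not the paper's definition).

    B             : (p, K) coefficient matrix, one row per predictor,
                    one column per fine category.
    coarse_groups : list of lists of fine-category column indices;
                    the groups may overlap.
    """
    within = 0.0
    for g in coarse_groups:
        Bg = B[:, g]
        # Deviation of each fine-category coefficient from its group mean.
        dev = Bg - Bg.mean(axis=1, keepdims=True)
        within += np.linalg.norm(dev, axis=1).sum()
    # Row-wise norm: driving it to zero removes a predictor entirely.
    rows = np.linalg.norm(B, axis=1).sum()
    return lam * within + gamma * rows


def classify_predictors(B, coarse_groups, tol=1e-8):
    """Label each predictor as 'irrelevant', 'coarse', or 'fine'."""
    labels = []
    for row in B:
        # A row that is constant across categories does not change the
        # multinomial probabilities, so it is treated as irrelevant.
        if np.linalg.norm(row - row.mean()) <= tol:
            labels.append("irrelevant")
            continue
        varies_within = any(
            row[g].max() - row[g].min() > tol for g in coarse_groups
        )
        labels.append("fine" if varies_within else "coarse")
    return labels


# Hypothetical example: 3 predictors, 4 fine categories, 2 coarse categories.
B = np.array([
    [0.0, 0.0, 0.0, 0.0],    # no effect on any category
    [1.0, 1.0, -1.0, -1.0],  # separates the coarse categories only
    [1.0, -1.0, 0.0, 0.0],   # separates fine categories within a group
])
groups = [[0, 1], [2, 3]]
labels = classify_predictors(B, groups)
penalty = multiresolution_penalty(B, groups, lam=1.0, gamma=1.0)
```

In this toy setting, a coefficient row that is constant within every coarse group contributes nothing to the first term, which is one way a penalty could favor coarse-resolution effects over fine-resolution ones.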
Pages: 3485-3496
Page count: 12