Label-aware distance mitigates temporal and spatial variability for clustering and visualization of single-cell gene expression data

Cited by: 0
Authors
Shaoheng Liang
Jinzhuang Dou
Ramiz Iqbal
Ken Chen
Affiliations
[1] Department of Bioinformatics and Computational Biology
[2] Department of Computer Science, Rice University
[3] Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University
Source
Communications Biology, Volume 7
Abstract
Clustering and visualization are essential parts of single-cell gene expression data analysis. The Euclidean distance used in most distance-based methods is not optimal. The batch effect, i.e., the variability among samples gathered at different times and from different tissues and patients, introduces large between-group distances and obscures the true identities of cells. To solve this problem, we introduce Label-Aware Distance (Lad), a metric that uses the temporal/spatial locality of the batch effect to control for such factors. We validate Lad on simulated data and apply it to a mouse retina development dataset and a lung dataset. We also demonstrate the utility of our approach in understanding the progression of Coronavirus Disease 2019 (COVID-19). Lad provides better cell embeddings than state-of-the-art batch correction methods on longitudinal datasets. It can be used in distance-based clustering and visualization methods to combine the power of multiple samples and help make biological findings.
Related papers
50 items in total
  • [21] scSCC: A swapped contrastive learning-based clustering method for single-cell gene expression data
    Wang, Xiang
    Yang, Sansheng
    Li, Hongwei
    QUANTITATIVE BIOLOGY, 2025, 13 (02)
  • [22] AllenDigger, a Tool for Spatial Expression Data Visualization, Spatial Heterogeneity Delineation, and Single-Cell Registration Based on the Allen Brain Atlas
    Wu, Qian
    Zhuo, Yan
    Wang, Xiaoqun
    Wang, Mengdi
    Zhuo, Liangchen
    Ma, Wenji
    JOURNAL OF PHYSICAL CHEMISTRY A, 2023, 127 (12) : 2864 - 2872
  • [23] Clustering and visualization of single-cell RNA-seq data using path metrics
    Manousidaki, Andriana
    Little, Anna
    Xie, Yuying
    PLOS COMPUTATIONAL BIOLOGY, 2024, 20 (05)
  • [24] Hybrid Clustering of Single-Cell Gene Expression and Spatial Information via Integrated NMF and K-Means
    Oh, Sooyoun
    Park, Haesun
    Zhang, Xiuwei
    FRONTIERS IN GENETICS, 2021, 12
  • [25] Single-Cell Resolution of Temporal Gene Expression during Heart Development
    DeLaughter, Daniel M.
    Bick, Alexander G.
    Wakimoto, Hiroko
    McKean, David
    Gorham, Joshua M.
    Kathiriya, Irian S.
    Hinson, John T.
    Homsy, Jason
    Gray, Jesse
    Pu, William
    Bruneau, Benoit G.
    Seidman, J. G.
    Seidman, Christine E.
    DEVELOPMENTAL CELL, 2016, 39 (04) : 480 - 490
  • [26] Assessing single-cell transcriptomic variability through density-preserving data visualization
    Narayan, Ashwin
    Berger, Bonnie
    Cho, Hyunghoon
    NATURE BIOTECHNOLOGY, 2021, 39 : 765 - 774
  • [27] Assessing single-cell transcriptomic variability through density-preserving data visualization
    Narayan, Ashwin
    Berger, Bonnie
    Cho, Hyunghoon
    NATURE BIOTECHNOLOGY, 2021, 39 (06) : 765 - +
  • [28] scHiGex: predicting single-cell gene expression based on single-cell Hi-C data
    Shrestha, Bishal
    Siciliano, Andrew Jordan
    Zhu, Hao
    Liu, Tong
    Wang, Zheng
    NAR GENOMICS AND BIOINFORMATICS, 2025, 7 (01)
  • [29] Clustering and visualization approaches for human cell cycle gene expression data analysis
    Napolitano, F.
    Raiconi, G.
    Tagliaferri, R.
    Ciaramella, A.
    Staiano, A.
    Miele, G.
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2008, 47 (01) : 70 - 84
  • [30] Selecting gene features for unsupervised analysis of single-cell gene expression data
    Sheng, Jie
    Li, Wei Vivian
    BRIEFINGS IN BIOINFORMATICS, 2021, 22 (06)