Boosting scRNA-seq data clustering by cluster-aware feature weighting

Citations: 3
Authors
Li, Rui-Yi [1]
Guan, Jihong [1]
Zhou, Shuigeng [2, 3]
Affiliations
[1] Tongji Univ, Dept Comp Sci & Technol, 4800 Caoan Rd, Shanghai 201804, Peoples R China
[2] Fudan Univ, Shanghai Key Lab Intelligent Informat Proc, 220 Handan Rd, Shanghai 200433, Peoples R China
[3] Fudan Univ, Sch Comp Sci, 220 Handan Rd, Shanghai 200433, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Single cell RNA sequencing; Feature weighting; feature selection; Clustering; MESSENGER-RNA-SEQ; CELL-TYPES; SINGLE; HETEROGENEITY; CLASSIFICATION; RECONSTRUCTION;
DOI
10.1186/s12859-021-04033-7
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background: The rapid development of single-cell RNA sequencing (scRNA-seq) enables the exploration of cell heterogeneity, which is usually done by clustering scRNA-seq data. The essence of scRNA-seq data clustering is to group cells by measuring the similarities among the genes/transcripts of cells. The selection of features for evaluating cell similarity is therefore of great importance, as it significantly affects clustering effectiveness and efficiency.
Results: In this paper, we propose a novel method called CaFew that selects genes based on cluster-aware feature weighting. By optimizing the clustering objective function, CaFew obtains a feature weight matrix, which is then used for feature selection: genes that have large weights in at least one cluster, or whose weights vary greatly across clusters, are selected. Experiments on 8 real scRNA-seq datasets show that CaFew clearly improves the clustering performance of existing scRNA-seq data clustering methods. In particular, the combination of CaFew with SC3 achieves state-of-the-art performance. Furthermore, CaFew also benefits the visualization of scRNA-seq data.
Conclusion: Thanks to its gene selection mechanism based on cluster-aware feature weighting, CaFew is effective for scRNA-seq data clustering and is a useful tool for scRNA-seq data analysis.
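The selection rule stated in the abstract (keep genes with a large weight in at least one cluster, or with weights that vary greatly across clusters) can be illustrated with a minimal sketch. The abstract does not give the objective function, the thresholds, or the measure of variation, so everything below is an assumption for illustration only: the function name `select_genes`, the `top_k` cutoff, the use of population variance as the spread measure, and the toy weight matrix are all hypothetical, and this is not CaFew's actual implementation.

```python
from statistics import pvariance


def select_genes(weights, top_k=1):
    """Pick genes from a cluster-by-gene weight matrix (hypothetical layout).

    weights[c][g] is the weight of gene g in cluster c. A gene is kept if it
    ranks in the top_k by its largest per-cluster weight, or in the top_k by
    the variance of its weights across clusters. Both criteria and top_k are
    assumptions, not values taken from the paper.
    """
    n_genes = len(weights[0])
    # Collect each gene's weights across all clusters.
    columns = [[row[g] for row in weights] for g in range(n_genes)]
    max_weight = [max(col) for col in columns]
    spread = [pvariance(col) for col in columns]

    def top(scores):
        # Indices of the top_k highest-scoring genes.
        order = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
        return set(order[:top_k])

    # Union of the two criteria, returned as sorted gene indices.
    return sorted(top(max_weight) | top(spread))


# Toy matrix: 2 clusters x 4 genes (made-up numbers).
weights = [
    [0.9, 0.6, 0.1, 0.1],
    [0.9, 0.1, 0.1, 0.2],
]
selected = select_genes(weights, top_k=1)
print(selected)
```

In this toy matrix, gene 0 is kept for its uniformly large weight and gene 1 for its large difference between clusters, while genes 2 and 3 are dropped. The selected columns would then be passed to a downstream clustering method.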
Pages: 15
Related papers
50 in total
  • [1] Boosting scRNA-seq data clustering by cluster-aware feature weighting
    Rui-Yi Li
    Jihong Guan
    Shuigeng Zhou
    BMC Bioinformatics, 22
  • [2] scSFCL: Deep clustering of scRNA-seq data with subspace feature confidence learning
    Meng, Xiaokun
    Zhang, Yuanyuan
    Xu, Xiaoyu
    Zhang, Kaihao
    Feng, Baoming
    COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2025, 114
  • [3] Recursive Clustering of Cellular Diversity in scRNA-Seq Data
    Squires, Michael
    Qiu, Peng
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2025,
  • [4] A framework for scRNA-seq data clustering based on multi-view feature integration
    Li, Feng
    Liu, Yang
    Liu, Jinxing
    Ge, Daohui
    Shang, Junliang
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 89
  • [5] FSCAM: CAM-Based Feature Selection for Clustering scRNA-seq
    Wang, Yan
    Gao, Jie
    Xuan, Chenxu
    Guan, Tianhao
    Wang, Yujie
    Zhou, Gang
    Ding, Tao
    INTERDISCIPLINARY SCIENCES-COMPUTATIONAL LIFE SCIENCES, 2022, 14 (02) : 394 - 408
  • [6] Contrastive self-supervised clustering of scRNA-seq data
    Ciortan, Madalina
    Defrance, Matthieu
    BMC BIOINFORMATICS, 2021, 22 (01)
  • [7] GNN-based embedding for clustering scRNA-seq data
    Ciortan, Madalina
    Defrance, Matthieu
    BIOINFORMATICS, 2022, 38 (04) : 1037 - 1044
  • [8] scDSSC: Deep Sparse Subspace Clustering for scRNA-seq Data
    Wang, HaiYun
    Zhao, JianPing
    Zheng, ChunHou
    Su, YanSen
    PLOS COMPUTATIONAL BIOLOGY, 2022, 18 (12)