Biases in machine-learning models of human single-cell data

Authors
Willem, Theresa [1, 2]
Shitov, Vladimir A. [3, 4, 5, 6]
Luecken, Malte D. [3, 4, 5, 6]
Kilbertus, Niki [2, 7, 8]
Bauer, Stefan [2, 7, 8]
Piraud, Marie [2]
Buyx, Alena [1]
Theis, Fabian J. [2, 7, 9]
Affiliations
[1] Tech Univ Munich, TUM Sch Med & Hlth, Inst Hist & Eth Med, Munich, Germany
[2] Helmholtz Munich, Munich, Germany
[3] Helmholtz Munich, Inst Computat Biol, Dept Computat Hlth, Munich, Germany
[4] Helmholtz Munich, Comprehens Pneumol Ctr CPC CPC M bioArch, Munich, Germany
[5] Helmholtz Munich, Inst Lung Hlth & Immun LHI, Munich, Germany
[6] German Ctr Lung Res DZL, Munich, Germany
[7] Tech Univ Munich, Sch Computat Informat & Technol, Munich, Germany
[8] Munich Ctr Machine Learning MCML, Munich, Germany
[9] Tech Univ Munich, Sch Life Sci, Munich, Germany
Keywords
GENOMICS; RACISM; RACE;
DOI
10.1038/s41556-025-01619-8
Chinese Library Classification number
Q2 [Cell biology]
Discipline classification codes
071009; 090102
Abstract
Recent machine-learning (ML)-based advances in single-cell data science have enabled the stratification of human tissue donors at single-cell resolution, promising to provide valuable diagnostic and prognostic insights. However, such insights are susceptible to biases. Here we discuss various biases that emerge along the pipeline of ML-based single-cell analysis, ranging from societal biases affecting whose samples are collected, to clinical and cohort biases that influence the generalizability of single-cell datasets, biases stemming from single-cell sequencing, ML biases specific to (weakly supervised or unsupervised) ML models trained on human single-cell samples and biases during the interpretation of results from ML models. We end by providing methods for single-cell data scientists to assess and mitigate biases, and call for efforts to address the root causes of biases.
Pages: 384-392
Number of pages: 9