Biases in machine-learning models of human single-cell data

Citations: 0
Authors
Willem, Theresa [1, 2]
Shitov, Vladimir A. [3, 4, 5, 6]
Luecken, Malte D. [3, 4, 5, 6]
Kilbertus, Niki [2, 7, 8]
Bauer, Stefan [2, 7, 8]
Piraud, Marie [2]
Buyx, Alena [1]
Theis, Fabian J. [2, 7, 9]
Affiliations
[1] Tech Univ Munich, TUM Sch Med & Hlth, Inst Hist & Eth Med, Munich, Germany
[2] Helmholtz Munich, Munich, Germany
[3] Helmholtz Munich, Inst Computat Biol, Dept Computat Hlth, Munich, Germany
[4] Helmholtz Munich, Comprehens Pneumol Ctr CPC CPC M bioArch, Munich, Germany
[5] Helmholtz Munich, Inst Lung Hlth & Immun LHI, Munich, Germany
[6] German Ctr Lung Res DZL, Munich, Germany
[7] Tech Univ Munich, Sch Computat Informat & Technol, Munich, Germany
[8] Munich Ctr Machine Learning MCML, Munich, Germany
[9] Tech Univ Munich, Sch Life Sci, Munich, Germany
Keywords
GENOMICS; RACISM; RACE
DOI
10.1038/s41556-025-01619-8
Chinese Library Classification number
Q2 [Cell biology]
Discipline classification code
071009; 090102
Abstract
Recent machine-learning (ML)-based advances in single-cell data science have enabled the stratification of human tissue donors at single-cell resolution, promising to provide valuable diagnostic and prognostic insights. However, such insights are susceptible to biases. Here we discuss various biases that emerge along the pipeline of ML-based single-cell analysis, ranging from societal biases affecting whose samples are collected, to clinical and cohort biases that influence the generalizability of single-cell datasets, biases stemming from single-cell sequencing, ML biases specific to (weakly supervised or unsupervised) ML models trained on human single-cell samples and biases during the interpretation of results from ML models. We end by providing methods for single-cell data scientists to assess and mitigate biases, and call for efforts to address the root causes of biases.
Pages: 384 - 392
Page count: 9
Related papers
50 in total
  • [1] Editorial: Machine Learning and Mathematical Models for Single-Cell Data Analysis
    Ou-Yang, Le
    Zhang, Xiao-Fei
    Zhang, Jiajun
    Chen, Jin
    Wu, Min
    FRONTIERS IN GENETICS, 2022, 13
  • [2] The Trifecta of Single-Cell, Systems-Biology, and Machine-Learning Approaches
    Weiskittel, Taylor M.
    Correia, Cristina
    Yu, Grace T.
    Ung, Choong Yong
    Kaufmann, Scott H.
    Billadeau, Daniel D.
    Li, Hu
    GENES, 2021, 12 (07)
  • [3] New interpretable machine-learning method for single-cell data reveals correlates of clinical response to cancer immunotherapy
    Greene, Evan
    Finak, Greg
    D'Amico, Leonard A.
    Bhardwaj, Nina
    Church, Candice D.
    Morishima, Chihiro
    Ramchurren, Nirasha
    Taube, Janis M.
    Nghiem, Paul T.
    Cheever, Martin A.
    Fling, Steven P.
    Gottardo, Raphael
    PATTERNS, 2021, 2 (12)
  • [4] Machine Learning Approaches to Single-Cell Data Integration and Translation
    Uhler, Caroline
    Shivashankar, G. V.
    PROCEEDINGS OF THE IEEE, 2022, 110 (05) : 557 - 576
  • [5] Single-Cell Data Analytics in ScOrange (General Machine Learning)
    Strazar, Martin
    Zagar, Lan
    Kokosar, Jaka
    Tanko, Vesna
    Policar, Pavlin
    Erjavec, Ales
    Pretnar, Ajda
    Staric, Anze
    Menon, Vilas
    Chen, Rui
    Shaulsky, Gad
    Lemire, Andrew
    Parikh, Anup
    Zupan, Blaz
    ARTIFICIAL INTELLIGENCE IN MEDICINE, AIME 2019, 2019, 11526 : 425 - 426
  • [6] Identification of Kidney Cell Types in Single-Cell RNA Sequencing and Single-Nucleus RNA Sequencing Data Using Machine-Learning Algorithms
    Madapoosi, Siddharth S.
    Tisch, Adam
    Blough, Stephen A.
    Rosa, Jan S.
    Eddy, Sean
    Naik, Abhijit S.
    Limonte, Christine P.
    McCown, Phillip J.
    Menon, Rajasree
    Rosas, Sylvia E.
    Parikh, Chirag R.
    Mariani, Laura H.
    Kretzler, Matthias
    Mahfouz, Ahmed
    Alakwaa, Fadhl
    JOURNAL OF THE AMERICAN SOCIETY OF NEPHROLOGY, 2024, 35 (10)
  • [7] Fairness in the Eyes of the Data: Certifying Machine-Learning Models
    Segal, Shahar
    Adi, Yossi
    Pinkas, Benny
    Baum, Carsten
    Ganesh, Chaya
    Keshet, Joseph
    AIES '21: PROCEEDINGS OF THE 2021 AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY, 2021 : 926 - 935
  • [8] Single-cell specific and interpretable machine learning models for sparse scChIP-seq data imputation
    Albrecht, Steffen
    Andreani, Tommaso
    Andrade-Navarro, Miguel A.
    Fontaine, Jean Fred
    PLOS ONE, 2022, 17 (07)
  • [9] Statistical and machine learning methods for immunoprofiling based on single-cell data
    Zhang, Jingxuan
    Li, Jia
    Lin, Lin
    HUMAN VACCINES & IMMUNOTHERAPEUTICS, 2023, 19 (02)
  • [10] Certified Machine-Learning Models
    Damiani, Ernesto
    Ardagna, Claudio A.
    SOFSEM 2020: THEORY AND PRACTICE OF COMPUTER SCIENCE, 2020, 12011 : 3 - 15