Biases in machine-learning models of human single-cell data

Authors
Willem, Theresa [1, 2]
Shitov, Vladimir A. [3, 4, 5, 6]
Luecken, Malte D. [3, 4, 5, 6]
Kilbertus, Niki [2, 7, 8]
Bauer, Stefan [2, 7, 8]
Piraud, Marie [2]
Buyx, Alena [1]
Theis, Fabian J. [2, 7, 9]
Affiliations
[1] Tech Univ Munich, TUM Sch Med & Hlth, Inst Hist & Eth Med, Munich, Germany
[2] Helmholtz Munich, Munich, Germany
[3] Helmholtz Munich, Inst Computat Biol, Dept Computat Hlth, Munich, Germany
[4] Helmholtz Munich, Comprehens Pneumol Ctr CPC CPC M bioArch, Munich, Germany
[5] Helmholtz Munich, Inst Lung Hlth & Immun LHI, Munich, Germany
[6] German Ctr Lung Res DZL, Munich, Germany
[7] Tech Univ Munich, Sch Computat Informat & Technol, Munich, Germany
[8] Munich Ctr Machine Learning MCML, Munich, Germany
[9] Tech Univ Munich, Sch Life Sci, Munich, Germany
Keywords
GENOMICS; RACISM; RACE;
DOI
10.1038/s41556-025-01619-8
Chinese Library Classification (CLC) number
Q2 [Cell biology];
Discipline classification codes
071009 ; 090102 ;
Abstract
Recent machine-learning (ML)-based advances in single-cell data science have enabled the stratification of human tissue donors at single-cell resolution, promising to provide valuable diagnostic and prognostic insights. However, such insights are susceptible to biases. Here we discuss various biases that emerge along the pipeline of ML-based single-cell analysis, ranging from societal biases affecting whose samples are collected, to clinical and cohort biases that influence the generalizability of single-cell datasets, biases stemming from single-cell sequencing, ML biases specific to (weakly supervised or unsupervised) ML models trained on human single-cell samples and biases during the interpretation of results from ML models. We end by providing methods for single-cell data scientists to assess and mitigate biases, and call for efforts to address the root causes of biases.
Pages: 384-392
Page count: 9