Discovering and Overcoming the Bias in Neoantigen Identification by Unified Machine Learning Models

Cited by: 0
Authors
Zhang, Ziting
Wu, Wenxu
Wei, Lei
Wang, Xiaowo [1]
Affiliations
[1] Tsinghua Univ, Minist Educ, Key Lab Bioinformat, Beijing, Peoples R China
Source
RESEARCH IN COMPUTATIONAL MOLECULAR BIOLOGY, RECOMB 2024 | 2024 / Vol. 14758
Keywords
neoantigen identification; data bias; machine learning; attention mechanism
DOI
10.1007/978-1-0716-3989-4_28
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Neoantigens, formed by genetic mutations in tumor cells, are abnormal peptides that can trigger immune responses. Precisely identifying neoantigens among the vast number of mutations is key to tumor immunotherapy design. The neoantigen immune process has three main steps: binding to MHCs, extracellular presentation, and induction of immunogenicity. Various machine learning methods have been developed to predict the probability of one of these three events, but the overall accuracy of neoantigen identification remains far from satisfactory. To gain a systematic understanding of the key factors in neoantigen identification, we developed ImmuBPI, a unified transformer-based machine learning framework that comprises three tasks and achieves state-of-the-art performance. Through cross-task model interpretation, we discovered that data bias in immunogenicity prediction has been underestimated, which has led to skewed discriminatory boundaries in current machine learning models. We designed a mutual information-based debiasing strategy that performed well on immunogenicity prediction for mutation variants, a task where current methods fall short. Clustering immunogenic peptides with debiased representations uncovers unique preferences for biophysical properties such as hydrophobicity and polarity. These observations complement the previous understanding that accurate neoantigen prediction is constrained by limited data, and they highlight the necessity of bias control. We expect this study to provide novel and insightful perspectives for neoantigen prediction methods and to benefit future neoantigen-mediated immunotherapy designs.
Pages: 348 - 351
Page count: 4
Related papers
50 in total
  • [1] FAIRVIS: Visual Analytics for Discovering Intersectional Bias in Machine Learning
    Cabrera, Angel Alexander
    Epperson, Will
    Hohman, Fred
    Kahng, Minsuk
    Morgenstern, Jamie
    Chau, Duen Horng
    2019 IEEE CONFERENCE ON VISUAL ANALYTICS SCIENCE AND TECHNOLOGY (VAST), 2019, : 46 - 56
  • [2] FAIRNESS AND BIAS IN MACHINE LEARNING MODELS
    Langworthy, Andrew
    Journal of the Institute of Telecommunications Professionals, 2023, 17 : 29 - 33
  • [3] Discovering nuclear models from symbolic machine learning
    Munoz, Jose M.
    Udrescu, Silviu M.
    Ruiz, Ronald F. Garcia
    COMMUNICATIONS PHYSICS, 2025, 8 (01)
  • [4] Discovering Interpretable Machine Learning Models in Parallel Coordinates
    Kovalerchuk, Boris
    Hayes, Dustin
    2021 25TH INTERNATIONAL CONFERENCE INFORMATION VISUALISATION (IV): AI & VISUAL ANALYTICS & DATA SCIENCE, 2021, : 181 - 188
  • [5] Mitigating Bias in Clinical Machine Learning Models
    Julio C. Perez-Downes
    Andrew S. Tseng
    Keith A. McConn
    Sara M. Elattar
    Olayemi Sokumbi
    Ronnie A. Sebro
    Megan A. Allyse
    Bryan J. Dangott
    Rickey E. Carter
    Demilade Adedinsewo
    Current Treatment Options in Cardiovascular Medicine, 2024, 26 : 29 - 45
  • [6] Mitigating Bias in Clinical Machine Learning Models
    Perez-Downes, Julio C.
    Tseng, Andrew S.
    McConn, Keith A.
    Elattar, Sara M.
    Sokumbi, Olayemi
    Sebro, Ronnie A.
    Allyse, Megan A.
    Dangott, Bryan J.
    Carter, Rickey E.
    Adedinsewo, Demilade
    CURRENT TREATMENT OPTIONS IN CARDIOVASCULAR MEDICINE, 2024, 26 (03) : 29 - 45
  • [7] Statistical quantification of confounding bias in machine learning models
    Spisak, Tamas
    GIGASCIENCE, 2022, 11
  • [8] Bias Discovery in Machine Learning Models for Mental Health
    Mosteiro, Pablo
    Kuiper, Jesse
    Masthoff, Judith
    Scheepers, Floortje
    Spruit, Marco
    INFORMATION, 2022, 13 (05)
  • [9] Toward Reliable and Transferable Machine Learning Potentials: Uniform Training by Overcoming Sampling Bias
    Jeong, Wonseok
    Lee, Kyuhyun
    Yoo, Dongsun
    Lee, Dongheon
    Han, Seungwu
    JOURNAL OF PHYSICAL CHEMISTRY C, 2018, 122 (39) : 22790 - 22795
  • [10] Promoting Machine Abilities of Discovering and Utilizing Knowledge in a Unified Zero-Shot Learning Paradigm
    Mao, Qingyang
    Li, Zhi
    Liu, Qi
    Wu, Likang
    Zhang, Hefu
    Chen, Enhong
    ACM Transactions on Knowledge Discovery from Data, 2024, 19 (01)