Stratification of diabetes in the context of comorbidities, using representation learning and topological data analysis

Cited by: 0
Authors
Malgorzata Wamil
Abdelaali Hassaine
Shishir Rao
Yikuan Li
Mohammad Mamouei
Dexter Canoy
Milad Nazarzadeh
Zeinab Bidel
Emma Copland
Kazem Rahimi
Gholamreza Salimi-Khorshidi
Affiliations
[1] University of Oxford, Deep Medicine, Oxford Martin School
[2] Mayo Clinic Healthcare, Nuffield Department of Women’s and Reproductive Health, Medical Science Division
[3] University of Oxford
Abstract
Diabetes is a heterogeneous, multimorbid disorder with large variation in manifestations, trajectories, and outcomes. The aim of this study is to validate a novel machine learning method for phenotyping diabetes in the context of comorbidities. Data from 9967 multimorbid patients with a new diagnosis of diabetes were extracted from the Clinical Practice Research Datalink. First, using BEHRT (a transformer-based deep learning architecture), the embeddings corresponding to diabetes were learned. Next, topological data analysis (TDA) was carried out to test how different areas of the high-dimensional manifold correspond to different risk profiles. The following endpoints were considered when profiling risk trajectories: major adverse cardiovascular events (MACE), coronary artery disease (CAD), stroke (CVA), heart failure (HF), renal failure (RF), diabetic neuropathy, peripheral arterial disease, reduced visual acuity, and all-cause mortality. Kaplan-Meier curves were plotted for each derived phenotype. Finally, we tested the performance of an established risk prediction model (QRISK) after adding TDA-derived features. We identified four subgroups of patients with diabetes and divergent comorbidity patterns that differed in their risk of future cardiovascular, renal, and other microvascular outcomes. Phenotype 1 (young with chronic inflammatory conditions) and phenotype 2 (young with CAD) included relatively younger patients with diabetes compared to phenotypes 3 (older with hypertension and renal disease) and 4 (older with previous CVA), and those subgroups had a higher frequency of pre-existing cardio-renal diseases. Within ten years of follow-up, 2592 patients (26%) experienced MACE, 2515 patients (25%) died, and 2020 patients (20%) suffered RF. Adding the specific TDA-derived phenotype and the distances to both extremities of the TDA graph raised the AUC of the QRISK3 model from 67.26% (CI 67.25–67.28%) to 67.67% (CI 67.66–67.69%), improving its prediction of CV outcomes. We confirmed the importance of accounting for multimorbidity when risk-stratifying a heterogeneous cohort of patients with a new diagnosis of diabetes. Our unsupervised machine learning method improved the prediction of clinical outcomes.
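The per-phenotype Kaplan-Meier step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `kaplan_meier` and `curves_by_phenotype`, the record layout, and the toy values are all assumptions made for illustration, and the standard product-limit estimator is used in plain Python.

```python
from collections import defaultdict


def kaplan_meier(durations, events):
    """Product-limit estimate: [(time, survival)] at each distinct event time.

    durations: follow-up time per patient; events: 1 if the endpoint
    occurred at that time, 0 if the patient was censored.
    """
    deaths = defaultdict(int)  # endpoint events at each time
    exits = defaultdict(int)   # events plus censorings at each time
    for t, e in zip(durations, events):
        exits[t] += 1
        if e:
            deaths[t] += 1

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths[t]
        if d:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone who had an event or was censored at t leaves the risk set.
        at_risk -= exits[t]
    return curve


def curves_by_phenotype(records):
    """records: iterable of (phenotype_label, duration, event_flag)."""
    groups = defaultdict(lambda: ([], []))
    for label, t, e in records:
        groups[label][0].append(t)
        groups[label][1].append(e)
    return {label: kaplan_meier(ts, es) for label, (ts, es) in groups.items()}


# Hypothetical toy records (made up, not data from the study).
toy_records = [
    ("phenotype_1", 1, 1), ("phenotype_1", 2, 1), ("phenotype_1", 2, 0),
    ("phenotype_1", 3, 1), ("phenotype_1", 4, 0),
    ("phenotype_3", 1, 1), ("phenotype_3", 1, 1), ("phenotype_3", 5, 0),
]
toy_curves = curves_by_phenotype(toy_records)
```

Each entry of `toy_curves` is a step function that could be plotted per phenotype; in practice a survival-analysis library would also provide confidence bands and log-rank tests.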
Related papers
50 in total
  • [31] A topological model for partial equivariance in deep learning and data analysis
    Ferrari, Lucia
    Frosini, Patrizio
    Quercioli, Nicola
    Tombari, Francesca
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2023, 6
  • [32] Learning quantum phase transitions through topological data analysis
    Tirelli, Andrea
    Costa, Natanael C.
    PHYSICAL REVIEW B, 2021, 104 (23)
  • [33] Federated Incremental Learning algorithm based on Topological Data Analysis
    Hu, Kai
    Gong, Sheng
    Li, Lingxiao
    Luo, Yuantu
    Li, YaoGen
    Jiang, Shanshan
    PATTERN RECOGNITION, 2025, 158
  • [34] UPWARD TOPOLOGICAL ANALYSIS OF LARGE CIRCUITS USING DIRECTED GRAPH REPRESENTATION
    STARZYK, JA
    SLIWA, E
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, 1984, 31 (04): 410-414
  • [35] Designing machine learning workflows with an application to topological data analysis
    Cawi, Eric
    La Rosa, Patricio S.
    Nehorai, Arye
    PLOS ONE, 2019, 14 (12)
  • [36] Biomarker discovery studies for patient stratification using machine learning analysis of omics data: a scoping review
    Glaab, Enrico
    Rauschenberger, Armin
    Banzi, Rita
    Gerardi, Chiara
    Garcia, Paula
    Demotes, Jacques
    BMJ OPEN, 2021, 11 (12)
  • [37] Improving risk-stratification of Diabetes complications using temporal data mining
    Sacchi, Lucia
    Dagliati, Arianna
    Segagni, Daniele
    Leporati, Paola
    Chiovato, Luca
    Bellazzi, Riccardo
    2015 37TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2015: 2131-2134
  • [38] Ethereum Price Prediction using Topological Data Analysis
    Hafez, Samia M.
    ElNainay, Mustafa
    Abougabal, Mohamed
    Kosba, Ahmed
    2022 IEEE GLOBAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND INTERNET OF THINGS (GCAIOT), 2022: 146-153
  • [40] Cloud Detection and Characterization using Topological Data Analysis
    Guiang, Chona S.
    Levine, Robert Y.
    REMOTE SENSING OF CLOUDS AND THE ATMOSPHERE XVII; AND LIDAR TECHNOLOGIES, TECHNIQUES, AND MEASUREMENTS FOR ATMOSPHERIC REMOTE SENSING VIII, 2012, 8534