Unsupervised Learning to Subphenotype Heart Failure Patients from Electronic Health Records

Cited by: 0
Authors
Hackl, Melanie [1]
Datta, Suparno [1, 2]
Miotto, Riccardo [2]
Bottinger, Erwin [1, 2]
Affiliations
[1] Univ Potsdam, Hasso Plattner Inst, Digital Hlth Ctr, Potsdam, Germany
[2] Icahn Sch Med Mt Sinai, Hasso Plattner Inst Digital Hlth Mt Sinai, New York, NY USA
Keywords
Unsupervised learning; Electronic health records; Heart failure;
DOI
10.1007/978-3-030-77211-6_24
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Heart failure (HF) is a deadly disease whose prevalence is slowly increasing. HF subtypes are currently determined mostly by ejection fraction (EF). In this work, we seek novel subgroups of HF using a fully data-driven approach that clusters patients based on their electronic health records (EHRs). Using a validated phenotyping algorithm, we identified 14,334 adult patients with heart failure in our database. We derived patient embeddings using two strategies: one processes aggregated clinical features with principal component analysis (PCA) and uniform manifold approximation and projection (UMAP); the other learns embeddings from the sequence of medical events with a long short-term memory (LSTM) autoencoder. We then evaluated clustering strategies, including k-means and agglomerative hierarchical clustering, to derive the most informative subtypes. The results were compared using metrics such as the silhouette coefficient and by comparing outcomes such as hospitalization and EF between the clusters. In the most promising result, we identified three subclusters using the aggregated-data approach with UMAP for dimensionality reduction and k-means for clustering. Patients in cluster 1 had the fewest hospital days and comorbidities, while patients in cluster 3 had significantly more hospital days together with a higher prevalence of comorbidities such as chronic kidney disease and atrial fibrillation. Patients in cluster 2 had a high prevalence of drug allergies in their medical history.
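The clustering and evaluation step named in the abstract (k-means scored by the silhouette coefficient) can be illustrated with a minimal sketch. This is not the authors' implementation: the 2-D points below are made-up stand-ins for low-dimensional patient embeddings, the function names `kmeans` and `silhouette` are hypothetical, and a deterministic farthest-point initialisation is assumed so that the run is reproducible.

```python
import math

def kmeans(points, k, iters=100):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from its nearest chosen centroid.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient, s = (b - a) / max(a, b) per point."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        dists = {c: [] for c in clusters}
        for j, q in enumerate(points):
            if i != j:
                dists[labels[j]].append(math.dist(p, q))
        own = dists[labels[i]]
        if not own:  # singleton cluster: score is 0 by convention
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(d) / len(d) for c, d in dists.items() if c != labels[i] and d
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Toy "embeddings": three well-separated groups (illustrative data only).
points = [
    (0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
    (5.0, 5.0), (5.2, 4.9), (4.9, 5.3),
    (10.0, 0.0), (10.1, 0.2), (9.8, 0.1),
]

labels = kmeans(points, 3)
score_k3 = silhouette(points, labels)
score_k2 = silhouette(points, kmeans(points, 2))
print(f"k=2: {score_k2:.3f}  k=3: {score_k3:.3f}")
```

On this toy data the silhouette coefficient is higher for k=3 than for k=2, which is the kind of comparison used to choose among candidate clusterings. In practice a library such as scikit-learn would normally replace these hand-written functions.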
Pages: 219-228
Page count: 10