Deep representation learning for clustering longitudinal survival data from electronic health records

Cited by: 0
Authors
Qiu, Jiajun [1]
Hu, Yao [1]
Li, Li [1]
Erzurumluoglu, Abdullah Mesut [1]
Braenne, Ingrid [1]
Whitehurst, Charles [2]
Schmitz, Jochen [2]
Arora, Jatin [1]
Bartholdy, Boris Alexander [1]
Gandhi, Shrey [1]
Khoueiry, Pierre [1]
Mueller, Stefanie [1]
Noyvert, Boris [1]
Ding, Zhihao [1]
Jensen, Jan Nygaard [1]
de Jong, Johann [1]
Affiliations
[1] Boehringer Ingelheim Pharm GmbH & Co KG, Global Computat Biol & Digital Sci, Biberach, Germany
[2] Boehringer Ingelheim GmbH & Co KG, Immunol & Resp Dis, Ridgefield, CT USA
Keywords
GENOME-WIDE ASSOCIATION; LIKELIHOOD; VARIANTS; MEDICINE
DOI
10.1038/s41467-025-56625-z
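Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]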
Discipline classification code
07; 0710; 09
Abstract
Precision medicine requires accurate identification of clinically relevant patient subgroups. Electronic health records provide major opportunities for leveraging machine learning approaches to uncover novel patient subgroups. However, many existing approaches fail to adequately capture complex interactions between diagnosis trajectories and disease-relevant risk events, leading to subgroups that can still display great heterogeneity in event risk and underlying molecular mechanisms. To address this challenge, we implemented VaDeSC-EHR, a transformer-based variational autoencoder for clustering longitudinal survival data as extracted from electronic health records. We show that VaDeSC-EHR outperforms baseline methods on both synthetic and real-world benchmark datasets with known ground-truth cluster labels. In an application to Crohn's disease, VaDeSC-EHR successfully identifies four distinct subgroups with divergent diagnosis trajectories and risk profiles, revealing clinically and genetically relevant factors in Crohn's disease. Our results show that VaDeSC-EHR can be a powerful tool for discovering novel patient subgroups in the development of precision medicine approaches.
Pages: 14