Predicting hypertension onset from longitudinal electronic health records with deep learning

Cited by: 11
Authors
Datta, Suparno [1 ,2 ]
Morassi Sasso, Ariane [1 ,2 ]
Kiwit, Nina [1 ]
Bose, Subhronil [1 ]
Nadkarni, Girish [1 ,2 ,3 ]
Miotto, Riccardo [2 ,4 ]
Boettinger, Erwin P. [1 ,2 ,3 ,5 ]
Affiliations
[1] Univ Potsdam, Hasso Plattner Inst, Digital Hlth Ctr, Potsdam, Germany
[2] Icahn Sch Med Mt Sinai, Hasso Plattner Inst Digital Hlth Mt Sinai, New York, NY 10029 USA
[3] Icahn Sch Med Mt Sinai, Dept Med, New York, NY 10029 USA
[4] Icahn Sch Med Mt Sinai, Dept Genet & Genom Sci, New York, NY 10029 USA
[5] Icahn Sch Med Mt Sinai, Windreich Dept Artificial Intelligence & Human Hl, New York, NY 10029 USA
Funding
National Institutes of Health (NIH), USA;
Keywords
machine learning; electronic health records; deep learning; hypertension; HIGH BLOOD-PRESSURE; INCIDENT HYPERTENSION; AMERICAN-COLLEGE; RISK; PREVENTION; MANAGEMENT; ADULTS;
DOI
10.1093/jamiaopen/ooac097
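Chinese Library Classification number
R19 [Health care organization and services (health services administration)];
Subject classification code
Abstract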
Objective: Hypertension has long been recognized as one of the most important predisposing factors for cardiovascular diseases and mortality. In recent years, machine learning methods have shown potential in diagnostic and predictive approaches in chronic diseases. Electronic health records (EHRs) have emerged as a reliable source of longitudinal data. The aim of this study is to predict the onset of hypertension using modern deep learning (DL) architectures, specifically long short-term memory (LSTM) networks, and longitudinal EHRs.

Materials and Methods: We compare this approach to the best-performing models reported in previous work, particularly XGBoost, applied to aggregated features. Our work is based on data from 233 895 adult patients from a large health system in the United States. We divided our population into 2 distinct longitudinal datasets based on the diagnosis date. To ensure generalization to unseen data, we trained our models on the first dataset (dataset A, "train and validation") using cross-validation, and then applied the models to a second dataset (dataset B, "test") to assess their performance. We also experimented with 2 different time windows before the onset of hypertension and evaluated the impact on model performance.

Results: With the LSTM network, we achieved an area under the receiver operating characteristic curve (AUROC) of 0.98 in the "train and validation" dataset A and 0.94 in the "test" dataset B for a prediction time window of 1 year. Lipid disorders, type 2 diabetes, and renal disorders were found to be associated with incident hypertension.

Conclusion: These findings show that DL models based on temporal EHR data can improve the identification of patients at high risk of hypertension and of the corresponding driving factors. In the long term, this work may support identifying individuals who are at high risk of developing hypertension and facilitate earlier intervention to prevent it.
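The evaluation design described in the abstract (fit on dataset A, score on a date-separated dataset B, report AUROC) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the abstract does not state the exact splitting rule, so a single cutoff date is used; the field names `index_date`, `label`, and `score` are hypothetical; and the scores are made-up toy values standing in for model outputs, not results from the study. No LSTM is trained here.

```python
from datetime import date


def split_by_date(records, cutoff):
    """Split records into (dataset_a, dataset_b) around a cutoff date.

    Assumption: a single cutoff on a hypothetical 'index_date' field;
    the abstract only says the split is based on the diagnosis date.
    """
    dataset_a = [r for r in records if r["index_date"] < cutoff]
    dataset_b = [r for r in records if r["index_date"] >= cutoff]
    return dataset_a, dataset_b


def auroc(labels, scores):
    """AUROC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy records: 'score' stands in for a model's predicted risk (made up).
records = [
    {"index_date": date(2015, 3, 1), "label": 1, "score": 0.9},
    {"index_date": date(2016, 7, 1), "label": 0, "score": 0.2},
    {"index_date": date(2019, 1, 1), "label": 1, "score": 0.7},
    {"index_date": date(2019, 6, 1), "label": 0, "score": 0.4},
    {"index_date": date(2020, 2, 1), "label": 0, "score": 0.8},
]

# Dataset A would be used for cross-validated training, dataset B only for testing.
dataset_a, dataset_b = split_by_date(records, date(2018, 1, 1))
test_auc = auroc(
    [r["label"] for r in dataset_b],
    [r["score"] for r in dataset_b],
)
```

Keeping dataset B strictly later in time than dataset A is one way to approximate performance on unseen future patients, which is the motivation the abstract gives for the two-dataset design.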
Pages: 10