EMR-LIP: A lightweight framework for standardizing the preprocessing of longitudinal irregular data in electronic medical records

Citations: 0
Authors
Luo, Jiawei [1, 2, 3]
Huang, Shixin [4, 5]
Lan, Lan [6]
Yang, Shu [7]
Cao, Tingqian [8]
Yin, Jin [1, 2, 3]
Qiu, Jiajun [1, 2, 3]
Yang, Xiaoyan [1, 2, 3]
Guo, Yingqiang [1 ]
Zhou, Xiaobo [9 ]
Affiliations
[1] Sichuan Univ, West China Hosp, West China Sch Med, Dept Cardiovasc Surg, Chengdu 610041, Sichuan, Peoples R China
[2] Sichuan Univ, West China Hosp, West China Biomed Big Data Ctr, West China Sch Med, Chengdu 610041, Sichuan, Peoples R China
[3] Sichuan Univ, Medx Ctr Informat, Chengdu 610041, Peoples R China
[4] Peoples Hosp Yubei Dist Chongqing, Dept Sci Res, Chongqing 401120, Peoples R China
[5] Chongqing Univ Posts & Telecommun, Sch Commun & Informat Engn, Chongqing 400065, Peoples R China
[6] Capital Med Univ, Beijing Tiantan Hosp, IT Ctr, Beijing 100070, Peoples R China
[7] Chengdu Univ Tradit Chinese Med, Coll Med Informat Engn, Chengdu 610075, Peoples R China
[8] Sichuan Univ, West China Hosp, Integrated Care Management Ctr, Chengdu 610041, Peoples R China
[9] Univ Texas, Ctr Computat Syst Med, McWilliams Sch Biomed Informat, Hlth Sci Ctr Houston, Houston, TX 77030 USA
Funding
National Natural Science Foundation of China
Keywords
Electronic medical records; Longitudinal data; Irregular data; Preprocessing pipeline; Deep learning; PREDICTION; SEPSIS; MODEL;
DOI
10.1016/j.cmpb.2024.108521
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Objective: Longitudinal data from Electronic Medical Records (EMRs) are increasingly utilized to construct predictive models for various clinical tasks, offering enhanced insights into patient health. However, significant discrepancies exist in preprocessing the irregular and intricate EMR data across studies due to the absence of universally accepted tools and standardization methods. This study introduces the Electronic Medical Record Longitudinal Irregular Data Preprocessing (EMR-LIP) framework, a lightweight approach for optimizing the preprocessing of longitudinal, irregular EMR data, aiming to enhance research efficiency, consistency, reproducibility, and comparability.
Materials and Methods: EMR-LIP modularizes the preprocessing of longitudinal irregular EMR data, offering tools with a low level of encapsulation. Compared to other pipelines, EMR-LIP categorizes variables in a more granular manner, designing specific preprocessing techniques for each type. To demonstrate its versatility, EMR-LIP was applied in an empirical study to two public EMR databases, MIMIC-IV and eICU-CRD. Data processed with EMR-LIP was then used to test several renowned deep learning models on a range of commonly used benchmark tasks.
Results: In both the MIMIC-IV and eICU-CRD databases, models based on EMR-LIP showed superior baseline performance compared to previous studies. Interestingly, using data preprocessed by EMR-LIP, traditional models such as LSTM and GRU outperformed more complex models, achieving an AUROC of up to 0.94 for in-hospital death prediction. Additionally, models based on EMR-LIP showed stable performance across various resampling intervals and exhibited better fairness in performance across different ethnic groups.
Conclusion: EMR-LIP streamlines the preprocessing of irregular longitudinal EMR data, offering an end-to-end solution for model-ready data creation, and has been open-sourced for collaborative refinement by the research community.
Pages: 21