Improving an Electronic Health Record-Based Clinical Prediction Model Under Label Deficiency: Network-Based Generative Adversarial Semisupervised Approach

Cited by: 1
Authors
Li, Runze [1]
Tian, Yu [1]
Shen, Zhuyi [1]
Li, Jin [2]
Li, Jun [3]
Ding, Kefeng [3]
Li, Jingsong [1, 4]
Affiliations
[1] Zhejiang Univ, Coll Biomed Engn & Instrument Sci, Hangzhou, Peoples R China
[2] Nanjing Univ Informat Sci & Technol, Inst Artificial Intelligence Med, Sch Artificial Intelligence, Nanjing, Peoples R China
[3] Zhejiang Univ, Sch Med, Affiliated Hosp 2, Dept Surg Oncol, Hangzhou, Peoples R China
[4] Zhejiang Univ, Coll Biomed Engn & Instrument Sci, Zhou Yiqing Sci & Technol Bldg, 2nd Floor, 38 Zheda, Hangzhou 310027, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
semisupervised learning; generative adversarial network; network analysis; label deficiency; clinical prediction; electronic health record; EHR; adversarial network; data set
DOI
10.2196/47862
Chinese Library Classification number
R-058
Subject classification code
Abstract
Background: Observational biomedical studies facilitate a new strategy for large-scale electronic health record (EHR) utilization to support precision medicine. However, data label inaccessibility is an increasingly important issue in clinical prediction, despite the use of synthetic and semisupervised learning from data. Little research has aimed to uncover the underlying graphical structure of EHRs.
Objective: A network-based generative adversarial semisupervised method is proposed. The objective is to train clinical prediction models on label-deficient EHRs to achieve learning performance comparable to that of supervised methods.
Methods: Three public data sets and one colorectal cancer data set gathered from the Second Affiliated Hospital of Zhejiang University were selected as benchmarks. The proposed models were trained on 5% to 25% labeled data and evaluated on classification metrics against conventional semisupervised and supervised methods. The data quality, model security, and memory scalability were also evaluated.
Results: The proposed method for semisupervised classification outperforms related semisupervised methods under the same setup, with the average area under the receiver operating characteristic curve (AUC) reaching 0.945, 0.673, 0.611, and 0.588 for the four data sets, respectively, followed by graph-based semisupervised learning (0.450, 0.454, 0.425, and 0.5676, respectively) and label propagation (0.475, 0.344, 0.440, and 0.477, respectively). The average classification AUCs with 10% labeled data were 0.929, 0.719, 0.652, and 0.650, respectively, comparable to those of the supervised learning methods logistic regression (0.601, 0.670, 0.731, and 0.710, respectively), support vector machines (0.733, 0.720, 0.720, and 0.721, respectively), and random forests (0.982, 0.750, 0.758, and 0.740, respectively). The concerns regarding the secondary use of data and data security are alleviated by realistic data synthesis and robust privacy preservation.
Conclusions: Training clinical prediction models on label-deficient EHRs is indispensable in data-driven research. The proposed method has great potential to exploit the intrinsic structure of EHRs and achieve learning performance comparable to that of supervised methods.
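The evaluation setup described in the abstract (hiding all but 5% to 25% of the labels, then scoring predictions by AUC) can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `mask_labels` and `roc_auc`, the toy labels, and the seed are all hypothetical, and the AUC is computed with the standard pairwise-ranking definition rather than any code from the paper.

```python
import random

def mask_labels(labels, labeled_fraction, seed=0):
    """Keep a random fraction of labels; mark the rest as unlabeled (None).

    Hypothetical helper: the paper does not state how its labeled
    subsets were sampled.
    """
    rng = random.Random(seed)
    n_keep = max(1, round(labeled_fraction * len(labels)))
    keep = set(rng.sample(range(len(labels)), n_keep))
    return [y if i in keep else None for i, y in enumerate(labels)]

def roc_auc(y_true, scores):
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy binary labels standing in for an EHR outcome (assumed data).
labels = [0, 1] * 50
for fraction in (0.05, 0.10, 0.25):
    masked = mask_labels(labels, fraction)
    n_labeled = sum(y is not None for y in masked)
    print(f"{fraction:.0%} labeled: {n_labeled} of {len(labels)} labels kept")

# Score a small set of hypothetical model outputs against true labels.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A semisupervised model would be fit on the masked labels together with the unlabeled records, and the AUC would then be computed on held-out records whose true labels are known.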
Pages: 14