Multi-Label Clinical Time-Series Generation via Conditional GAN

Cited by: 4
Authors
Lu, Chang [1]
Reddy, Chandan K. [2]
Wang, Ping [1]
Nie, Dong [3]
Ning, Yue [1]
Affiliations
[1] Stevens Inst Technol, Dept Comp Sci, Hoboken, NJ 07310 USA
[2] Virginia Tech, Dept Comp Sci, Arlington, VA 22203 USA
[3] Univ North Carolina Chapel Hill, Dept Comp Sci, Chapel Hill, NC 27599 USA
Funding
US National Science Foundation;
Keywords
Diseases; Generative adversarial networks; Generators; Training; Task analysis; Synthetic data; Measurement; Electronic health records; generative adversarial network (GAN); time-series generation; imbalanced data;
DOI
10.1109/TKDE.2023.3310909
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, deep learning has been successfully adopted in a wide range of applications related to electronic health records (EHRs) such as representation learning and clinical event prediction. However, due to privacy constraints, limited access to EHR becomes a bottleneck for deep learning research. To mitigate these concerns, generative adversarial networks (GANs) have been successfully used for generating EHR data. However, there are still challenges in high-quality EHR generation, including generating time-series EHR data and imbalanced uncommon diseases. In this work, we propose a Multi-label Time-series GAN (MTGAN) to generate EHR and simultaneously improve the quality of uncommon disease generation. The generator of MTGAN uses a gated recurrent unit (GRU) with a smooth conditional matrix to generate sequences and uncommon diseases. The critic gives scores using Wasserstein distance to recognize real samples from synthetic samples by considering both data and temporal features. We also propose a training strategy to calculate temporal features for real data and stabilize GAN training. Furthermore, we design multiple statistical metrics and prediction tasks to evaluate the generated data. Experimental results demonstrate the quality of the synthetic data and the effectiveness of MTGAN in generating realistic sequential EHR data, especially for uncommon diseases.
Pages: 1728-1740
Page count: 13