Generation of Synthetic Tabular Healthcare Data Using Generative Adversarial Networks

Cited by: 4
Authors
Nik, Alireza Hossein Zadeh [1, 2]
Riegler, Michael A. [1, 3]
Halvorsen, Pal [1, 4]
Storas, Andrea M. [1, 4]
Affiliations
[1] SimulaMet, Oslo, Norway
[2] Univ Stavanger, Stavanger, Norway
[3] Univ Tromso, Tromso, Norway
[4] OsloMet, Oslo, Norway
Source
Keywords
Synthetic data generation; Deep learning; Medical data
DOI
10.1007/978-3-031-27077-2_34
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
High-quality tabular data is a crucial requirement for developing data-driven applications, especially healthcare-related ones, because most of the data collected in this context today is in tabular form. However, strict data protection laws complicate access to medical datasets. Thus, synthetic data has become an ideal alternative for data scientists and healthcare professionals to circumvent such hurdles. Although many healthcare institutions still use classical de-identification and anonymization techniques for generating synthetic data, deep learning-based generative models such as generative adversarial networks (GANs) have shown remarkable performance in generating tabular datasets with complex structures. This paper examines the potential and applicability of GANs within the healthcare industry, which often faces serious challenges with insufficient training data and the sensitivity of patient records. We investigate several state-of-the-art GAN-based models proposed for tabular synthetic data generation. Healthcare datasets with different sizes, numbers of variables, column data types, feature distributions, and inter-variable correlations are examined. Moreover, a comprehensive evaluation framework is defined to evaluate the quality of the synthetic records and the viability of each model in preserving patients' privacy. The results indicate that the proposed models can generate synthetic datasets that maintain the statistical characteristics, model compatibility, and privacy of the original data. Moreover, synthetic tabular healthcare datasets can be a viable option in many data-driven applications. However, there is still room for further improvement in designing a perfect architecture for generating synthetic tabular data.
Pages: 434-446
Number of pages: 13