MISSING DATA IMPUTATION FOR HEALTH CARE BIG DATA USING DENOISING AUTOENCODER WITH GENERATIVE ADVERSARIAL NETWORK

Cited by: 0
Author
Zhang, Yinbing [1]
Affiliation
[1] Hubu Univ, Coll Chem & Chem Engn, Wuhan 430062, Hubei, Peoples R China
Source
Keywords
Data imputation; missing data; Autoencoders; GAN; Deep learning;
DOI
10.12694/scpe.v25i5.3023
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Missing data imputation is a key topic in healthcare, covering the issues and strategies involved in dealing with incomplete data in medical records, clinical trials, and health surveys. Healthcare data may be missing for a variety of reasons, including non-response in surveys, data entry errors, or information left unrecorded during clinical appointments. This paper introduces an approach to imputing missing data with a hybrid model that integrates denoising autoencoders with generative adversarial networks (GANs). We begin by highlighting the prevalence of missing data in healthcare datasets and its potential impact on analytical outcomes. The proposed methodology leverages the denoising autoencoder's ability to reconstruct data from noisy inputs, coupled with the GAN's proficiency in generating synthetic data that is indistinguishable from real data. By combining these two neural network architectures, our model demonstrates an enhanced capability to predict and fill in missing data points. To validate our approach, we conducted experiments on several large-scale healthcare datasets with varying degrees of artificially introduced missingness. The performance of our model was benchmarked against traditional imputation methods such as mean imputation and k-nearest neighbors, as well as against standalone denoising autoencoders and GANs. Our results indicate a significant improvement in imputation accuracy, as measured by root mean square error (RMSE) and mean absolute error (MAE), supporting the efficacy of the hybrid model in handling missing data robustly.
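The evaluation protocol outlined in the abstract (introduce missingness artificially, impute, then score with RMSE and MAE) can be sketched roughly as follows. This is a minimal illustration only, not the paper's implementation: it uses mean imputation, one of the baselines named above, in place of the hybrid model, and the synthetic data, the 20% missing-completely-at-random rate, and all function names are assumptions made for the example.

```python
import numpy as np


def mask_completely_at_random(X, missing_rate, rng):
    """Return a copy of X with random entries set to NaN, plus the mask of removed entries."""
    mask = rng.random(X.shape) < missing_rate
    X_missing = X.copy()
    X_missing[mask] = np.nan
    return X_missing, mask


def mean_impute(X_missing):
    """Fill each NaN with the mean of the observed values in its column."""
    col_means = np.nanmean(X_missing, axis=0)
    return np.where(np.isnan(X_missing), col_means, X_missing)


def rmse(X_true, X_imputed, mask):
    """Root mean square error over the artificially removed entries only."""
    diff = X_true[mask] - X_imputed[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


def mae(X_true, X_imputed, mask):
    """Mean absolute error over the artificially removed entries only."""
    diff = X_true[mask] - X_imputed[mask]
    return float(np.mean(np.abs(diff)))


# Synthetic stand-in for a complete numeric table (assumption: 1000 rows, 5 features).
rng = np.random.default_rng(0)
X_true = rng.normal(loc=50.0, scale=10.0, size=(1000, 5))

# Hide roughly 20% of the entries, impute them, and score against the known values.
X_missing, mask = mask_completely_at_random(X_true, 0.2, rng)
X_imputed = mean_impute(X_missing)

rmse_value = rmse(X_true, X_imputed, mask)
mae_value = mae(X_true, X_imputed, mask)
print(f"RMSE: {rmse_value:.3f}  MAE: {mae_value:.3f}")
```

Scoring only the masked entries keeps the comparison fair, since observed values are passed through unchanged by any imputer. A different imputer could be compared under the same protocol by swapping out `mean_impute` while reusing the same mask.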
Pages: 3850-3857
Number of pages: 8
Related papers
  • [21] A generative adversarial network for travel times imputation using trajectory data
    Zhang, Kunpeng
    He, Zhengbing
    Zheng, Liang
    Zhao, Liang
    Wu, Lan
    COMPUTER-AIDED CIVIL AND INFRASTRUCTURE ENGINEERING, 2021, 36 (02) : 197 - 212
  • [22] Missing data imputation in a transformer district based on time series imaging encoding and a generative adversarial network
    Liu K.
    Zhou F.
    Zhou H.
    Dianli Xitong Baohu yu Kongzhi/Power System Protection and Control, 2022, 50 (24) : 129 - 136
  • [23] Generative Adversarial Network for Desert Seismic Data Denoising
    Wang, Hongzhou
    Li, Yue
    Dong, Xintong
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2021, 59 (08) : 7062 - 7075
  • [24] Missing Features Reconstruction Using a Wasserstein Generative Adversarial Imputation Network
    Friedjungova, Magda
    Vasata, Daniel
    Balatsko, Maksym
    Jirina, Marcel
    COMPUTATIONAL SCIENCE - ICCS 2020, PT IV, 2020, 12140 : 225 - 239
  • [25] Mixed Data Imputation Using Generative Adversarial Networks
    Khan, Wasif
    Zaki, Nazar
    Ahmad, Amir
    Masud, Mohammad Mehedy
    Ali, Luqman
    Ali, Nasloon
    Ahmed, Luai A.
    IEEE ACCESS, 2022, 10 : 124475 - 124490
  • [26] Generative adversarial networks for imputing missing data for big data clinical research
    Weinan Dong
    Daniel Yee Tak Fong
    Jin-sun Yoon
    Eric Yuk Fai Wan
    Laura Elizabeth Bedford
    Eric Ho Man Tang
    Cindy Lo Kuen Lam
    BMC Medical Research Methodology, 21
  • [27] Generative adversarial networks for imputing missing data for big data clinical research
    Dong, Weinan
    Fong, Daniel Yee Tak
    Yoon, Jin-sun
    Wan, Eric Yuk Fai
    Bedford, Laura Elizabeth
    Tang, Eric Ho Man
    Lam, Cindy Lo Kuen
    BMC MEDICAL RESEARCH METHODOLOGY, 2021, 21 (01)
  • [28] Synthetic lung ultrasound data generation using autoencoder with generative adversarial network
    Fatima, Noreen
    Inchingolo, Riccardo
    Smargiassi, Andrea
    Soldati, Gino
    Torri, Elena
    Perrone, Tiziano
    Demi, Libertario
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2023, 153 (03):
  • [29] CN-GAIN: Classification and Normalization-Denormalization-Based Generative Adversarial Imputation Network for Missing SMES Data Imputation
    Sudrajat, Antonius Wahyu
    Ermatita
    Samsuryadi
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2025, 16 (01) : 314 - 322
  • [30] Missing Data Imputation for Real Time-series Data in a Steel Industry using Generative Adversarial Networks
    Sarda, Kisan
    Yerudkar, Amol
    Del Vecchio, Carmen
    IECON 2021 - 47TH ANNUAL CONFERENCE OF THE IEEE INDUSTRIAL ELECTRONICS SOCIETY, 2021,