Deep imputation of missing values in time series health data: A review with benchmarking

Cited by: 13
Authors
Kazijevs, Maksims [1]
Samad, Manar D. [1]
Affiliations
[1] Tennessee State Univ, Dept Comp Sci, Nashville, TN 37209 USA
Funding
National Institutes of Health (USA)
Keywords
Time series; Multivariate data; Longitudinal imputation; Cross-sectional imputation; Missing value imputation; Deep neural network; Electronic health records; Sensor data;
DOI
10.1016/j.jbi.2023.104440
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The imputation of missing values in multivariate time series (MTS) data is critical in ensuring data quality and producing reliable data-driven predictive models. Apart from many statistical approaches, a few recent studies have proposed state-of-the-art deep learning methods to impute missing values in MTS data. However, the evaluation of these deep methods is limited to one or two data sets, low missing rates, and completely random missing value types. This survey performs six data-centric experiments to benchmark state-of-the-art deep imputation methods on five time series health data sets. Our extensive analysis reveals that no single imputation method outperforms the others on all five data sets. The imputation performance depends on data types, individual variable statistics, missing value rates, and types. Deep learning methods that jointly perform cross-sectional (across variables) and longitudinal (across time) imputations of missing values in time series data yield statistically better data quality than traditional imputation methods. Although computationally expensive, deep learning methods are practical given the current availability of high-performance computing resources, especially when data quality and sample size are of paramount importance in healthcare informatics. Our findings highlight the importance of data-centric selection of imputation methods to optimize data-driven predictive models.
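The distinction the abstract draws between longitudinal (across time) and cross-sectional (across variables) imputation can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the synthetic three-variable series, the 20% completely-at-random missing rate, linear interpolation as the longitudinal baseline, a single-pass least-squares regression as the cross-sectional baseline, and the helper names `mcar_mask`, `impute_longitudinal`, `impute_cross_sectional`, and `rmse`. Neither baseline is one of the deep methods the survey benchmarks.

```python
import numpy as np


def mcar_mask(shape, rate, rng):
    """Boolean mask, True where a value is removed completely at random."""
    return rng.random(shape) < rate


def impute_longitudinal(x):
    """Fill each variable by linear interpolation over time (axis 0).

    Gaps at the start or end of a series are held at the nearest observed value.
    """
    out = x.copy()
    t = np.arange(x.shape[0])
    for j in range(x.shape[1]):
        obs = ~np.isnan(x[:, j])
        if obs.any():
            out[:, j] = np.interp(t, t[obs], x[obs, j])
    return out


def impute_cross_sectional(x):
    """Fill each variable by regressing it on the other variables at the same time step.

    Assumes every variable has at least one observed value. Missing predictors
    are mean-filled first, which is a simplification.
    """
    filled = np.where(np.isnan(x), np.nanmean(x, axis=0), x)
    out = x.copy()
    for j in range(x.shape[1]):
        miss = np.isnan(x[:, j])
        if not miss.any() or miss.all():
            continue
        others = np.delete(filled, j, axis=1)
        design = np.column_stack([np.ones(len(x)), others])
        # Fit on rows where the target is observed, predict where it is missing.
        coef, *_ = np.linalg.lstsq(design[~miss], x[~miss, j], rcond=None)
        out[miss, j] = design[miss] @ coef
    return out


def rmse(truth, imputed, mask):
    """Root-mean-square error over the artificially removed entries only."""
    return float(np.sqrt(np.mean((truth[mask] - imputed[mask]) ** 2)))


# Synthetic data (hypothetical): two linearly related signals and one unrelated signal.
rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 200)
base = np.sin(t)
truth = np.column_stack([base, 2 * base + 1, np.cos(t)])
truth = truth + 0.05 * rng.standard_normal(truth.shape)

mask = mcar_mask(truth.shape, rate=0.2, rng=rng)
observed = np.where(mask, np.nan, truth)

for name, fn in [("longitudinal", impute_longitudinal),
                 ("cross-sectional", impute_cross_sectional)]:
    print(name, rmse(truth, fn(observed), mask))
```

Scoring only the masked entries against the held-out ground truth mirrors the usual way imputation quality is measured. Changing `rate`, or replacing `mcar_mask` with a mask that depends on the data values, is one way to probe the dependence on missing rate and missing type that the abstract reports.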
Pages: 18