Augmenting energy time-series for data-efficient imputation of missing values

Cited by: 17
Authors
Liguori, Antonio [1]
Markovic, Romana [2]
Ferrando, Martina [3]
Frisch, Jerome [1]
Causone, Francesco [3]
van Treeck, Christoph [1]
Affiliations
[1] Rhein Westfal TH Aachen, E3D Inst Energy Efficiency & Sustainable Bldg, Mathieustr 30, D-52074 Aachen, Germany
[2] Karlsruhe Inst Technol, Bldg Sci Grp, Englerstr 7, D-76131 Karlsruhe, Germany
[3] Politecn Milan, Dept Energy, Via Lambruschini 4, I-20156 Milan, Italy
Keywords
Missing data; Data augmentation; Data scarcity; Building energy data; Deep learning; REPRESENTATIONS; NETWORK
DOI
10.1016/j.apenergy.2023.120701
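Chinese Library Classification (CLC) number
TE [Petroleum and Natural Gas Industry]; TK [Energy and Power Engineering]
Discipline classification code
0807; 0820
Abstract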
This study explores the applicability of data augmentation techniques for reconstructing missing energy time-series in limited data regimes. In particular, multiple synthetic copies of a relatively small training dataset are stacked together with pseudo-random noise. First, an existing convolutional denoising autoencoder is selected from a previous work as the base imputation model of this study. Then, an optimal augmentation rate, which minimizes the training set of the model, is chosen based on the preliminary results obtained from one building. The results showed that augmenting a nine-day training set 80 times could reduce the initial average root mean squared error (RMSE) by 37% and 48% for the continuous and random missing scenarios, respectively. Additionally, the augmented model outperformed the benchmark methods, with 23% and 12% lower average RMSE. No additional tuning or calibration costs were required for the existing base imputation model. Therefore, the presented data augmentation technique could significantly reduce the high computational costs associated with deep learning models.
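The augmentation step described in the abstract (stacking synthetic copies of a small training set with pseudo-random noise) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `augment_with_noise`, the Gaussian noise distribution, the noise scale of 0.05, the hourly resolution, and the reading of "80 times" as 80 noisy copies stacked on top of the original data are all assumptions, since the abstract does not specify them.

```python
import random


def augment_with_noise(series_days, n_copies, noise_std, seed=0):
    """Return the original daily profiles followed by n_copies noisy copies.

    Hypothetical helper: Gaussian noise is an assumption, the abstract
    only says "pseudo-random noise".
    """
    rng = random.Random(seed)
    # Keep the original profiles unchanged at the top of the stack.
    augmented = [list(day) for day in series_days]
    for _ in range(n_copies):
        for day in series_days:
            # Perturb every reading of the day with zero-mean noise.
            augmented.append([v + rng.gauss(0.0, noise_std) for v in day])
    return augmented


# Made-up data: nine days of 24 hourly readings scaled to [0, 1].
data_rng = random.Random(1)
nine_days = [[data_rng.uniform(0.0, 1.0) for _ in range(24)] for _ in range(9)]

augmented = augment_with_noise(nine_days, n_copies=80, noise_std=0.05)
print(len(augmented), len(augmented[0]))  # 729 24
```

Under these assumptions the nine original days grow to 9 + 80 × 9 = 729 training profiles, which would then be fed to the base imputation model without changing its architecture or hyperparameters.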
Pages: 18
Related papers
50 in total
  • [1] Missing values imputation in ocean buoy time series data
    Chakraborty, Samarpan
    Ide, Kayo
    Balachandran, Balakumar
    OCEAN ENGINEERING, 2025, 318
  • [2] Visual Imputation Analytics for Missing Time-Series Data in Bayesian Network
    Yeon, Hanbyul
    Son, Hyesook
    Jang, Yun
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP 2020), 2020: 303-310
  • [3] INTERPOLATING MISSING VALUES IN A TIME-SERIES
    DAMSLETH, E
    SCANDINAVIAN JOURNAL OF STATISTICS, 1980, 7 (01): 33-39
  • [4] A Review of Missing Values Handling Methods on Time-Series Data
    Pratama, Irfan
    Permanasari, Adhistya Erna
    Ardiyanto, Igi
    Indrayani, Rini
    PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY SYSTEMS AND INNOVATION (ICITSI), 2016
  • [5] Deep imputation of missing values in time series health data: A review with benchmarking
    Kazijevs, Maksims
    Samad, Manar D.
    JOURNAL OF BIOMEDICAL INFORMATICS, 2023, 144
  • [6] A novel imputation method for missing values in air pollutant time series data
    Pena, Mario
    Ortega, Patricia
    Orellana, Marcos
    2019 IEEE LATIN AMERICAN CONFERENCE ON COMPUTATIONAL INTELLIGENCE (LA-CCI), 2019: 99-104
  • [7] IMPUTATION FOR CONSECUTIVE MISSING VALUES IN NON-STATIONARY TIME SERIES DATA
    Wongoutong, Chantha
    ADVANCES AND APPLICATIONS IN STATISTICS, 2020, 64 (01): 87-102
  • [8] A bagging algorithm for the imputation of missing values in time series
    Andiojaya, Agung
    Demirhan, Haydar
    EXPERT SYSTEMS WITH APPLICATIONS, 2019, 129: 10-26
  • [9] Imputation of Missing Values in Time Series with Lagged Correlations
    Rahman, Shah Atiqur
    Huang, Yuxiao
    Claassen, Jan
    Kleinberg, Samantha
    2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOP (ICDMW), 2014: 753-762
  • [10] Recurrent Imputation for Multivariate Time Series with Missing Values
    Suo, Qiuling
    Yao, Liuyi
    Xun, Guangxu
    Sun, Jianhui
    Zhang, Aidong
    2019 IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI), 2019: 562-564