Development of Imputation Methods for Missing Data in Multiple Linear Regression Analysis

Cited by: 3
Authors
Thongsri, Thidarat [1]
Samart, Klairung [1]
Affiliations
[1] Prince Songkla Univ, Fac Sci, Div Computat Sci, Stat & Applicat Res Unit, Hat Yai, Thailand
Keywords
missing data; imputation method; composite method; multiple linear regression; HOT DECK IMPUTATION;
DOI
10.1134/S1995080222140323
Chinese Library Classification
O1 [Mathematics];
Discipline classification code
0701 ; 070101 ;
Abstract
Missing data is a common issue in many domains of study. If this issue is disregarded, erroneous conclusions may be reached. This study's objective is to develop and compare the efficiency of eight imputation methods: hot deck imputation (HD), k-nearest neighbors imputation (KNN), stochastic regression imputation (SR), predictive mean matching imputation (PMM), random forest imputation (RF), stochastic regression random forest with equivalent weight imputation (SREW), k-nearest random forest with equivalent weight imputation (KREW), and k-nearest stochastic regression and random forest with equivalent weight imputation (KSREW). The simulation was run using sample sizes of 30, 60, 100, and 150, and missing percentages of 10%, 20%, 30%, and 40%. The average mean square error (AMSE) was used to compare efficiency. The results reveal that the proposed composite approaches outperformed the single ones, particularly the three-component method KSREW. Increasing the number of components to a four-component method, on the other hand, had no effect on imputation performance.
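The abstract does not spell out how the composite methods combine their components or how AMSE is computed, so the following Python sketch rests on stated assumptions: "equivalent weight" is read as a plain average of the component imputations, AMSE is read as the mean over simulation replicates of the squared error on the missing cells, and the data are drawn from a hypothetical one-predictor linear model with values missing completely at random. Only two components (KNN and stochastic regression) are combined here to keep the example short; it is an illustration of the idea, not the authors' SREW, KREW, or KSREW procedures, and all function names are made up for this sketch.

```python
import random
import statistics


def knn_impute(x, y, k=3):
    """Fill each missing y (None) with the mean y of the k observed cases nearest in x."""
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    filled = []
    for xi, yi in zip(x, y):
        if yi is None:
            nearest = sorted(observed, key=lambda p: abs(p[0] - xi))[:k]
            filled.append(statistics.fmean([p[1] for p in nearest]))
        else:
            filled.append(yi)
    return filled


def stochastic_regression_impute(x, y, rng):
    """Fill each missing y with a least-squares prediction plus a normal residual draw."""
    xs = [xi for xi, yi in zip(x, y) if yi is not None]
    ys = [yi for yi in y if yi is not None]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    residuals = [b - (intercept + slope * a) for a, b in zip(xs, ys)]
    sigma = (sum(r * r for r in residuals) / (len(xs) - 2)) ** 0.5
    return [
        intercept + slope * xi + rng.gauss(0.0, sigma) if yi is None else yi
        for xi, yi in zip(x, y)
    ]


def equal_weight_composite(*imputations):
    """Assumed reading of 'equivalent weight': average the component imputations cell by cell."""
    return [statistics.fmean(values) for values in zip(*imputations)]


def amse(n, missing_rate, replicates, seed=0):
    """Average, over replicates, of the mean squared error on the cells that were missing."""
    rng = random.Random(seed)
    totals = {"KNN": 0.0, "SR": 0.0, "composite": 0.0}
    for _ in range(replicates):
        # Hypothetical data-generating model: y = 1 + 2x + noise.
        x = [rng.uniform(0.0, 10.0) for _ in range(n)]
        y_true = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
        missing = set(rng.sample(range(n), round(n * missing_rate)))
        y = [None if i in missing else yi for i, yi in enumerate(y_true)]
        knn = knn_impute(x, y)
        sr = stochastic_regression_impute(x, y, rng)
        filled_by_method = {
            "KNN": knn,
            "SR": sr,
            "composite": equal_weight_composite(knn, sr),
        }
        for name, filled in filled_by_method.items():
            errors = [(filled[i] - y_true[i]) ** 2 for i in missing]
            totals[name] += statistics.fmean(errors)
    return {name: total / replicates for name, total in totals.items()}


if __name__ == "__main__":
    # Sample sizes and missing percentages taken from the abstract.
    for n in (30, 60, 100, 150):
        for rate in (0.1, 0.2, 0.3, 0.4):
            print(n, rate, amse(n, rate, replicates=200))
```

Running the script prints one AMSE value per method for each combination of sample size and missing percentage listed in the abstract. The numbers depend entirely on the assumed data-generating model and should not be read as a reproduction of the paper's results.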
Pages: 3390-3399
Number of pages: 10
Related papers
50 in total
  • [21] Full Information Multiple Imputation for Linear Regression Model with Missing Response Variable
    Song, Limin
    Guo, Guangbao
    IAENG International Journal of Applied Mathematics, 2024, 54 (01) : 77 - 81
  • [22] Analysis of Longitudinal Clinical Trials with Missing Data Using Multiple Imputation in Conjunction with Robust Regression
    Mehrotra, Devan V.
    Li, Xiaoming
    Liu, Jiajun
    Lu, Kaifeng
    BIOMETRICS, 2012, 68 (04) : 1250 - 1259
  • [23] The performance of multiple imputation for missing covariate data within the context of regression relative survival analysis
    Giorgi, Roch
    Belot, Aurelien
    Gaudart, Jean
    Launoy, Guy
    STATISTICS IN MEDICINE, 2008, 27 (30) : 6310 - 6331
  • [24] Linear regression for bivariate censored data via multiple imputation
    Pan, W
    Kooperberg, C
    STATISTICS IN MEDICINE, 1999, 18 (22) : 3111 - 3121
  • [25] A multiple imputation approach to linear regression with clustered censored data
    Pan, W
    Connett, JE
    LIFETIME DATA ANALYSIS, 2001, 7 (02) : 111 - 123
  • [26] A Multiple Imputation Approach to Linear Regression with Clustered Censored Data
    Wei Pan
    John E. Connett
    Lifetime Data Analysis, 2001, 7 : 111 - 123
  • [27] Cox regression analysis with missing covariates via nonparametric multiple imputation
    Hsu, Chiu-Hsieh
    Yu, Mandi
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2019, 28 (06) : 1676 - 1688
  • [28] A Comparison of Estimation Methods for Missing Data in Multiple Linear Regression with Two Independent Variables
    Suraphee, Sujitta
    Raksmanee, Chancharoen
    Busaba, Jaruchat
    Chaisorn, Chanchai
    Nakornthai, Wilaiwan
    THAILAND STATISTICIAN, 2006, 4 : 13 - 26
  • [29] MICROARRAY MISSING DATA IMPUTATION USING REGRESSION
    Bayrak, Tuncay
    Ogul, Hasan
    2017 13TH IASTED INTERNATIONAL CONFERENCE ON BIOMEDICAL ENGINEERING (BIOMED), 2017, : 68 - 73
  • [30] Missing Value Imputation via Clusterwise Linear Regression
    Karmitsa, Napsu
    Taheri, Sona
    Bagirov, Adil
    Makinen, Pauliina
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (04) : 1889 - 1901