Development of Imputation Methods for Missing Data in Multiple Linear Regression Analysis

Cited by: 3

Authors:

Thongsri, Thidarat [1]

Samart, Klairung [1]

Affiliations:

[1] Prince Songkla Univ, Fac Sci, Div Computat Sci, Stat & Applicat Res Unit, Hat Yai, Thailand

Source:

LOBACHEVSKII JOURNAL OF MATHEMATICS | 2022, Vol. 43, No. 11

Keywords:

missing data; imputation method; composite method; multiple linear regression; hot deck imputation

DOI:

10.1134/S1995080222140323

Chinese Library Classification (CLC) number:

O1 [Mathematics];

Subject classification code:

0701 ; 070101 ;

Abstract:

Missing data is a common issue in many domains of study. If it is disregarded, erroneous conclusions may be reached. The objective of this study is to develop and compare the efficiency of eight imputation methods: hot deck imputation (HD), k-nearest neighbors imputation (KNN), stochastic regression imputation (SR), predictive mean matching imputation (PMM), random forest imputation (RF), stochastic regression random forest with equivalent weight imputation (SREW), k-nearest random forest with equivalent weight imputation (KREW), and k-nearest stochastic regression and random forest with equivalent weight imputation (KSREW). The simulation was run using sample sizes of 30, 60, 100, and 150, and missing percentages of 10%, 20%, 30%, and 40%. The average mean square error (AMSE) was used to compare efficiency. The results reveal that the proposed composite approaches outperformed the single ones, particularly the three-component method KSREW. Increasing the number of components to a four-component method, on the other hand, had no effect on imputation performance.
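
The abstract does not spell out how the equal-weight composite methods or the AMSE are computed, so the following is only a minimal illustrative sketch, not the authors' implementation. It assumes that "equivalent weight" means a plain average of the values imputed by each component method, and that AMSE is the mean squared error between true and imputed values averaged over simulation replications. To stay dependency-light it combines only KNN and stochastic regression (the random forest component used in SREW, KREW, and KSREW is omitted), so it does not correspond to any of the eight methods exactly. All function names, the data-generating model, and the parameter values (k = 3, 50 replications, 20% missing) are hypothetical.

```python
import numpy as np

def knn_impute(X, y, miss, k=3):
    """Impute each missing y as the mean y of its k nearest complete cases."""
    obs = ~miss
    out = y.copy()
    for i in np.where(miss)[0]:
        dist = np.linalg.norm(X[obs] - X[i], axis=1)
        nearest = np.argsort(dist)[:k]
        out[i] = y[obs][nearest].mean()
    return out

def stochastic_regression_impute(X, y, miss, rng):
    """Impute missing y with an OLS prediction plus a random normal residual."""
    obs = ~miss
    A = np.column_stack([np.ones(len(X)), X])      # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A[obs], y[obs], rcond=None)
    resid = y[obs] - A[obs] @ beta
    sigma = np.sqrt(resid @ resid / (obs.sum() - A.shape[1]))
    out = y.copy()
    out[miss] = A[miss] @ beta + rng.normal(0.0, sigma, miss.sum())
    return out

def equal_weight_composite(imputations):
    """Assumed composite rule: average the component imputations equally."""
    return np.mean(imputations, axis=0)

def mse(y_true, y_imp, miss):
    """Mean squared error over the imputed positions only."""
    return float(np.mean((y_true[miss] - y_imp[miss]) ** 2))

# Hypothetical simulation: n = 100, two predictors, 20% of responses missing.
rng = np.random.default_rng(0)
n, reps = 100, 50
mses = []
for _ in range(reps):
    X = rng.normal(size=(n, 2))
    y_true = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(scale=0.5, size=n)
    miss = np.zeros(n, dtype=bool)
    miss[rng.choice(n, size=n // 5, replace=False)] = True
    y_obs = np.where(miss, np.nan, y_true)

    y_knn = knn_impute(X, y_obs, miss)
    y_sr = stochastic_regression_impute(X, y_obs, miss, rng)
    y_comp = equal_weight_composite([y_knn, y_sr])
    mses.append(mse(y_true, y_comp, miss))

amse = float(np.mean(mses))  # average MSE across replications
print(f"AMSE of the two-component sketch: {amse:.4f}")
```

Adding a third component under the same assumption would only mean appending another imputed vector to the list passed to `equal_weight_composite`.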


Pages: 3390-3399

Number of pages: 10

50 records in total

[21] Full Information Multiple Imputation for Linear Regression Model with Missing Response Variable
Song, Limin
Guo, Guangbao
IAENG International Journal of Applied Mathematics, 2024, 54 (01) : 77 - 81
[22] Analysis of Longitudinal Clinical Trials with Missing Data Using Multiple Imputation in Conjunction with Robust Regression
Mehrotra, Devan V.
Li, Xiaoming
Liu, Jiajun
Lu, Kaifeng
BIOMETRICS, 2012, 68 (04) : 1250 - 1259
[23] The performance of multiple imputation for missing covariate data within the context of regression relative survival analysis
Giorgi, Roch
Belot, Aurelien
Gaudart, Jean
Launoy, Guy
STATISTICS IN MEDICINE, 2008, 27 (30) : 6310 - 6331
[24] Linear regression for bivariate censored data via multiple imputation
Pan, W
Kooperberg, C
STATISTICS IN MEDICINE, 1999, 18 (22) : 3111 - 3121
[25] A multiple imputation approach to linear regression with clustered censored data
Pan, W
Connett, JE
LIFETIME DATA ANALYSIS, 2001, 7 (02) : 111 - 123
[26] A Multiple Imputation Approach to Linear Regression with Clustered Censored Data
Wei Pan
John E. Connett
Lifetime Data Analysis, 2001, 7 : 111 - 123
[27] Cox regression analysis with missing covariates via nonparametric multiple imputation
Hsu, Chiu-Hsieh
Yu, Mandi
STATISTICAL METHODS IN MEDICAL RESEARCH, 2019, 28 (06) : 1676 - 1688
[28] A Comparison of Estimation Methods for Missing Data in Multiple Linear Regression with Two Independent Variables
Suraphee, Sujitta
Raksmanee, Chancharoen
Busaba, Jaruchat
Chaisorn, Chanchai
Nakornthai, Wilaiwan
THAILAND STATISTICIAN, 2006, 4 : 13 - 26
[29] MICROARRAY MISSING DATA IMPUTATION USING REGRESSION
Bayrak, Tuncay
Ogul, Hasan
2017 13TH IASTED INTERNATIONAL CONFERENCE ON BIOMEDICAL ENGINEERING (BIOMED), 2017, : 68 - 73
[30] Missing Value Imputation via Clusterwise Linear Regression
Karmitsa, Napsu
Taheri, Sona
Bagirov, Adil
Makinen, Pauliina
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (04) : 1889 - 1901
