Determining the optimal ridge parameter in logistic regression

Cited by: 0
Authors
Phrueksawatnon, Piyada [1]
Jitthavech, Jirawan [1]
Lorchirachoonkul, Vichit [1]
Affiliations
[1] Natl Inst Dev Adm, Sch Appl Stat, 118 Serithai Rd, Bangkok 10240, Thailand
Keywords
Bounds of the ridge parameter; Efficiency; Logistic regression; Multicollinearity; Simulation; Biased estimation
DOI
10.1080/03610918.2019.1626890
Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
A closed interval, based on the eigenvalues of the explanatory variables in the dataset, is analytically derived that contains the ridge parameter minimizing the mean squared error (MSE) of the coefficient estimators in a logistic regression model. Once the required accuracy is specified, a Fibonacci search can efficiently locate the optimal ridge parameter within this interval. A simulation comprising 2,000 replications for three sample sizes (100, 200, and 1,000) is drawn from a logistic regression model consisting of two correlated variables (with correlation coefficients of 0.90, 0.95, and 0.99) and one independent variable. Under the true mean squared error criterion, the simulation confirms that the relative efficiency of the estimator with the optimal ridge parameter is clearly higher than that of six commonly used ridge estimators. Finally, on a small real-life data set and with the criterion changed to the asymptotic mean squared error, comparisons with the same six estimators show that the relative efficiency of the estimator with the optimal ridge parameter is greater than or equal to that of the others.
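The Fibonacci search mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `fibonacci_search`, the objective `toy_mse`, and the interval [0, 2] are hypothetical placeholders. In particular, the eigenvalue-based bounds and the actual MSE expression are not given in this record, so a simple unimodal function stands in for the MSE as a function of the ridge parameter.

```python
def fibonacci_search(f, lo, hi, tol=1e-6):
    """Minimise a unimodal function f on the closed interval [lo, hi].

    Returns a point within `tol` of the minimiser, assuming f is unimodal
    on the interval (an assumption of this sketch).
    """
    if hi < lo:
        raise ValueError("hi must be >= lo")

    # Grow the Fibonacci sequence until F[n] >= (hi - lo) / tol, so that the
    # final half-width (hi - lo) / F[n] does not exceed tol.
    fib = [1, 1]
    while len(fib) < 3 or fib[-1] < (hi - lo) / tol:
        fib.append(fib[-1] + fib[-2])
    n = len(fib) - 1

    a, b = lo, hi
    # Two interior points placed at Fibonacci ratios of the interval.
    x1 = a + fib[n - 2] / fib[n] * (b - a)
    x2 = a + fib[n - 1] / fib[n] * (b - a)
    f1, f2 = f(x1), f(x2)

    for k in range(1, n - 1):
        if f1 > f2:
            # Minimum lies in [x1, b]; the old x2 becomes the new x1.
            a = x1
            x1, f1 = x2, f2
            x2 = a + fib[n - k - 1] / fib[n - k] * (b - a)
            f2 = f(x2)
        else:
            # Minimum lies in [a, x2]; the old x1 becomes the new x2.
            b = x2
            x2, f2 = x1, f1
            x1 = a + fib[n - k - 2] / fib[n - k] * (b - a)
            f1 = f(x1)

    return (a + b) / 2


def toy_mse(k):
    # Hypothetical stand-in for MSE(k); NOT the expression from the paper.
    return (k - 0.3) ** 2 + 1.0


# Placeholder bounds; the paper derives its bounds from the eigenvalues.
k_opt = fibonacci_search(toy_mse, 0.0, 2.0, tol=1e-6)
print(f"approximate optimal ridge parameter: {k_opt:.6f}")
```

Each iteration after the first reuses one of the two previous function evaluations, which is why the method is economical when every evaluation of the criterion is expensive. Specifying the accuracy `tol` in advance fixes the number of iterations, matching the "after specifying the required accuracy" step described above.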
Pages: 3569-3580
Number of pages: 12