Enhancing accuracy in modelling highly multicollinear data using alternative shrinkage parameters for ridge regression methods

Authors
Akhtar, Nadeem [1]
Alharthi, Muteb Faraj [2]
Affiliations
[1] Govt Degree Coll, Achin Payan Higher Educ Dept, Peshawar, Khyber Pakhtunk, Pakistan
[2] Taif Univ, Coll Sci, Dept Math & Stat, Taif, Saudi Arabia
Source
SCIENTIFIC REPORTS | 2025 / Vol. 15 / Issue 01
Keywords
Regression analysis; Multicollinearity; Shrinkage ridge estimators; Mean squared error (MSE); Monte-Carlo simulations
DOI
10.1038/s41598-025-94857-7
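Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09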
Abstract
In this study, we introduce three new shrinkage parameters for ridge regression, which dynamically adjust the ridge penalty based on the properties of the data, particularly the multicollinearity structure. Using these new parameters, we develop three ridge condition-adjusted estimators (CAREs), referred to as CARE1, CARE2, and CARE3, which are specifically designed to enhance predictive accuracy in datasets with significant multicollinearity and high error variance. The performance of the developed shrinkage estimators is evaluated through extensive simulation studies, using the Mean Square Error (MSE) criterion for accuracy assessment. The simulation results reveal that our proposed estimators consistently outperform existing estimators under different scenarios. We also apply these estimators to a real-world dataset, which further validates their practical utility for accurate prediction and model stability in complex scenarios; in this application, CARE3 emerged as the best-performing shrinkage estimator.
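The abstract does not give the formulas for the CARE shrinkage parameters, so they are not reproduced here. As a rough illustration of the evaluation approach it describes (a Monte Carlo comparison of estimators by MSE under multicollinearity), the following minimal Python sketch compares ordinary least squares with a ridge estimator whose shrinkage parameter is the classical Hoerl-Kennard-Baldwin choice, used purely as a stand-in. The function names, the sample size, the correlation level, the true coefficient vector, and the predictor-generating scheme are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def ridge_estimate(X, y, k):
    """Ridge estimator (X'X + k I)^{-1} X'y; k = 0 gives ordinary least squares."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)


def hkb_parameter(X, y):
    """Hoerl-Kennard-Baldwin shrinkage parameter, k = p * sigma^2 / (b'b).

    This is a stand-in for illustration only, NOT one of the CARE parameters.
    """
    n, p = X.shape
    beta_ols = ridge_estimate(X, y, 0.0)
    resid = y - X @ beta_ols
    sigma2_hat = resid @ resid / (n - p)  # unbiased error-variance estimate
    return p * sigma2_hat / (beta_ols @ beta_ols)


def simulate_mse(n=50, p=4, rho=0.99, sigma=1.0, reps=500, seed=0):
    """Estimate the MSE of each estimator by Monte Carlo simulation.

    Predictors share a common random component, so larger rho means
    stronger multicollinearity (assumed design, not the paper's).
    """
    rng = np.random.default_rng(seed)
    beta = np.ones(p) / np.sqrt(p)  # assumed true coefficients, unit length
    sq_err = {"ols": 0.0, "ridge_hkb": 0.0}
    for _ in range(reps):
        z = rng.standard_normal((n, p + 1))
        # Each column mixes its own noise with the shared last column of z.
        X = np.sqrt(1 - rho**2) * z[:, :p] + rho * z[:, [p]]
        y = X @ beta + sigma * rng.standard_normal(n)
        b_ols = ridge_estimate(X, y, 0.0)
        b_ridge = ridge_estimate(X, y, hkb_parameter(X, y))
        sq_err["ols"] += np.sum((b_ols - beta) ** 2)
        sq_err["ridge_hkb"] += np.sum((b_ridge - beta) ** 2)
    # Average squared estimation error over all replications.
    return {name: total / reps for name, total in sq_err.items()}
```

Calling `simulate_mse()` returns a dictionary with the estimated MSE of each estimator. Any other data-driven shrinkage rule could be swapped in for `hkb_parameter` to compare it under the same simulated conditions.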
Pages: 13