An ensemble learning approach for diabetes prediction using boosting techniques

Cited by: 11
Authors
Ganie, Shahid Mohammad [1]
Pramanik, Pijush Kanti Dutta [2]
Malik, Majid Bashir [3]
Mallik, Saurav [4]
Qin, Hong [5]
Affiliations
[1] Woxsen Univ, AI Res Ctr, Sch Business, Hyderabad, India
[2] Galgotias Univ, Sch Comp Applicat & Technol, Greater Noida, India
[3] Baba Ghulam Shah Badshah Univ, Dept Comp Sci, Rajauri, India
[4] Harvard Univ, Sch Publ Hlth, Dept Environm Hlth, Boston, MA 02138 USA
[5] Univ Tennessee Chattanooga, Coll Engn & Comp Sci, Chattanooga, TN 37403 USA
Keywords
diabetes prediction; ensemble learning; XGBoost; CatBoost; LightGBM; AdaBoost; gradient boost;
DOI
10.3389/fgene.2023.1252159
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Introduction: Diabetes is considered one of the leading healthcare concerns affecting millions worldwide. Taking appropriate action at the earliest stages of the disease depends on early diabetes prediction and identification. To support healthcare providers for better diagnosis and prognosis of diseases, machine learning has been explored in the healthcare industry in recent years.
Methods: To predict diabetes, this research has conducted experiments on five boosting algorithms on the Pima diabetes dataset. The dataset was obtained from the University of California, Irvine (UCI) machine learning repository, which contains several important clinical features. Exploratory data analysis was used to identify the characteristics of the dataset. Moreover, upsampling, normalisation, feature selection, and hyperparameter tuning were employed for predictive analytics.
Results: The results were analysed using various statistical/machine learning metrics and k-fold cross-validation techniques. Gradient boosting achieved the greatest accuracy rate of 92.85% among all the classifiers. Precision, recall, f1-score, and receiver operating characteristic (ROC) curves were used to further validate the model.
Discussion: The suggested model outperformed the current studies in terms of prediction accuracy, demonstrating its applicability to other diseases with similar predicate indications.
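The abstract names accuracy, precision, recall, f1-score, and k-fold cross-validation as its evaluation tools. The sketch below illustrates how those quantities are commonly computed for a binary classifier. It is not the authors' code: the function names `classification_metrics` and `k_fold_indices`, the label convention (1 = diabetic, 0 = non-diabetic), and the unshuffled contiguous folds are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (assumed: 1 = positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds of (train, test) index lists.

    No shuffling or stratification is applied; this is a simplification.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    # The first (n_samples % k) folds receive one extra sample.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


if __name__ == "__main__":
    # Made-up labels, used only to exercise the functions.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(classification_metrics(y_true, y_pred))
    for train_idx, test_idx in k_fold_indices(10, 3):
        print(len(train_idx), test_idx)
```

In a k-fold evaluation, a model would be fitted on each `train` index list and scored on the matching `test` list, and the per-fold metrics would then be averaged. With an imbalanced dataset, a shuffled and stratified split is usually preferable to the contiguous folds shown here.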
Pages: 15
Related papers
50 in total
  • [21] Software Fault Prediction Using an RNN-Based Deep Learning Approach and Ensemble Machine Learning Techniques
    Borandag, Emin
    APPLIED SCIENCES-BASEL, 2023, 13 (03):
  • [22] An Ensemble Approach for Prediction of Cardiovascular Disease Using Meta Classifier Boosting Algorithms
    Patro, Sibo Prasad
    Padhy, Neelamadhab
    Sah, Rahul Deo
    INTERNATIONAL JOURNAL OF DATA WAREHOUSING AND MINING, 2022, 18 (01)
  • [23] Early Prediction of Diabetes Using an Ensemble of Machine Learning Models
    Dutta, Aishwariya
    Hasan, Md Kamrul
    Ahmad, Mohiuddin
    Awal, Md Abdul
    Islam, Md Akhtarul
    Masud, Mehedi
    Meshref, Hossam
    INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH, 2022, 19 (19)
  • [24] Delamination localization in the composite thin plates using ensemble learning: Bagging and boosting techniques
    Das, O.
    Das, D. B.
    SCIENTIA IRANICA, 2024, 31 (04) : 310 - 329
  • [25] Diabetes prediction model using machine learning techniques
    Modak, Sandip Kumar Singh
    Jha, Vijay Kumar
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (13) : 38523 - 38549
  • [26] Diabetes prediction model using machine learning techniques
    Sandip Kumar Singh Modak
    Vijay Kumar Jha
    Multimedia Tools and Applications, 2024, 83 : 38523 - 38549
  • [27] DIABETES TWITTER ANALYSIS USING IMPROVED ENSEMBLE MACHINE LEARNING TECHNIQUES
    Prabha, V. Diviya
    Rathipriya, R.
    ADVANCES AND APPLICATIONS IN MATHEMATICAL SCIENCES, 2021, 21 (01): 241 - 250
  • [28] Performance prediction of impact hammer using ensemble machine learning techniques
    Ocak, Ibrahim
    Seker, Sadi Evren
    Rostami, Jamal
    TUNNELLING AND UNDERGROUND SPACE TECHNOLOGY, 2018, 80 : 269 - 276
  • [29] Enhanced slope stability prediction using ensemble machine learning techniques
    Yadav, Devendra Kumar
    Chattopadhyay, Swarup
    Tripathy, Debi Prasad
    Mishra, Pragyan
    Singh, Pritiranjan
    SCIENTIFIC REPORTS, 2025, 15 (01):
  • [30] Improved prediction of software defects using ensemble machine learning techniques
    Sweta Mehta
    K. Sridhar Patnaik
    Neural Computing and Applications, 2021, 33 : 10551 - 10562