Diabetes Prediction using SMOTE and Machine Learning

Cited by: 0
Authors
Sarayu, Maganti Khyathi [1]
Bhanu, Shaik Ayesha [1]
Deekshitha, Karanam [1]
Meghana, Maduri [1]
Joseph, Iwin Thanakumar [1]
Affiliations
[1] Koneru Lakshmaiah Educ Fdn, Dept Comp Sci & Engn, Vaddeswaram, Andhra Pradesh, India
Keywords
Diabetes Prediction; PIMA Dataset; Random Forest; Model Tuning; Data Preprocessing; Stratified Sampling; Class Imbalance; Performance Metrics; Machine Learning
DOI
10.1109/ICICI62254.2024.00011
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This work explores diabetes prediction algorithms using the PIMA Indian Diabetes dataset. It examines how model tuning, assessment criteria, and data preparation influence prediction algorithms. A pre-selected dataset is combined with feature scaling, stratified sampling, and oversampling to tackle class imbalance. Using machine learning models such as Random Forest, the study illustrates how modifying a model's components might improve prediction accuracy. Model performance is examined with stratified shuffle split validation, and large gains are reported in accuracy, F-measure, precision, recall, and AUC. The work underlines the necessity of data preparation for accurate diabetes prognosis and offers an example of Random Forest model construction.
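The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the SMOTE idea named in the title: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. This is not the authors' implementation. The function name smote, its parameters, and the toy points are assumptions made for illustration only, and no values come from the PIMA dataset. In practice, oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Illustrative SMOTE sketch (hypothetical helper, not the paper's code).

    Each synthetic sample is an interpolation between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy data, made up for illustration (not taken from the PIMA dataset).
majority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3), (0.9, 0.7), (1.3, 1.0)]
minority = [(3.0, 3.0), (3.2, 2.8), (2.9, 3.3)]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point is an interpolation between two existing minority samples, the new points stay inside the region already occupied by the minority class rather than duplicating existing rows.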
Pages: 15-20
Number of pages: 6