Different Scales of Medical Data Classification Based on Machine Learning Techniques: A Comparative Study

Cited by: 5
Authors
Elzeheiry, Heba Aly [1]
Barakat, Sherief [1]
Rezk, Amira [1]
Affiliations
[1] Mansoura Univ, Fac Comp & Informat, Dept Informat Syst, Mansoura 35516, Egypt
Source
APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 02
Keywords
medical big data; naive bayes (NB); linear model (LM); regression (R); decision tree (DT); random forest (RF); gradient boosted tree (GBT); J48; correlation feature selection (CFS); PREDICTION; PROMISE; DISEASE
DOI
10.3390/app12020919
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
In recent years, the volume of medical data has grown rapidly owing to the continuous generation of digital data. Medical data in their various forms, such as reports and textual, numerical, monitoring, and laboratory data, make up what is called medical big data. This paper aims to identify the algorithm that predicts new medical data with the highest accuracy, since good prediction accuracy is essential in medical fields. To this end, the most accurate algorithm and the algorithm with the lowest processing time are determined by experimentally comparing seven algorithms: Naive Bayes, linear model, regression, decision tree, random forest, gradient boosted tree, and J48. The experiments identify the algorithms with the best accuracy and the best processing time for predicting new medical big data. The most accurate classification algorithm is random forest, with accuracy values of 97.58%, 83.59%, and 90% for the heart disease, M-health, and diabetes datasets, respectively. Naive Bayes has the lowest processing time, with values of 0.078, 7.683, and 22.374 s for the heart disease, M-health, and diabetes datasets, respectively. In addition, the best overall result is obtained by combining the correlation feature selection (CFS) algorithm with the random forest (RF) classification algorithm. Applying RF with CFS to the heart disease dataset yields an accuracy of 90%, a precision of 83.3%, a sensitivity of 100%, and a processing time of 3 s. Applying the same combination to the M-health dataset yields an accuracy of 83.59%, a precision of 74.3%, a sensitivity of 93.1%, and a processing time of 13.481 s. On the diabetes dataset, it yields an accuracy of 97.58%, a precision of 86.39%, a sensitivity of 97.14%, and a processing time of 56.508 s.
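The abstract reports accuracy, precision, and sensitivity for each dataset. As a minimal sketch of how these three metrics are conventionally derived from a binary confusion matrix, the Python snippet below computes them from raw counts. The function name `binary_metrics` and the counts used in the example are hypothetical, chosen only for illustration; they are not taken from the paper and do not reflect the size of any of its datasets.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, and sensitivity as fractions in [0, 1].

    tp, fp, fn, tn are the counts of true positives, false positives,
    false negatives, and true negatives in a binary confusion matrix.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Sensitivity (recall): share of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
    }


# Hypothetical counts for illustration only (not from the paper).
metrics = binary_metrics(tp=5, fp=1, fn=0, tn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these illustrative counts there are no false negatives, so sensitivity is 100% even though one false positive lowers precision. This shows how a classifier can have perfect sensitivity while its precision and accuracy stay below 100%.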
Pages: 20