Predicting Breast Cancer via Supervised Machine Learning Methods on Class Imbalanced Data

Authors
Rajendran, Keerthana [1]
Jayabalan, Manoj [1, 2]
Thiruchelvam, Vinesh [1]
Affiliations
[1] Asia Pacific Univ Technol & Innovat, Sch Comp, Kuala Lumpur, Malaysia
[2] Liverpool John Moores Univ, Fac Engn & Technol, Liverpool, Merseyside, England
Keywords
Breast cancer; class imbalance; diagnosis; Bayesian network; model; risk; age
DOI
10.14569/IJACSA.2020.0110808
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Breast cancer, the second leading cause of fatality, is a widespread global health concern among women. Predicting the occurrence of breast cancer from risk factors paves the way to earlier diagnosis and faster, more efficient treatment. Although many predictive models for breast cancer have been developed, most are built from highly imbalanced data. Models trained on imbalanced data are usually biased towards the majority class, yet in cancer diagnosis it is crucial to correctly identify the patients with cancer, who are often the minority class. This study applies three class balancing techniques, namely oversampling (Synthetic Minority Oversampling Technique (SMOTE)), undersampling (SpreadSubsample), and a hybrid method (SMOTE and SpreadSubsample), to the Breast Cancer Surveillance Consortium (BCSC) dataset before constructing the supervised learning models. The algorithms employed are Naive Bayes, Bayesian Network, Random Forest, and Decision Tree (C4.5). The balancing method that yields the best performance across all four classifiers was then tested on the validation data to determine the final predictive model. Classifier performance was evaluated using the Receiver Operating Characteristic (ROC) curve, sensitivity, and specificity.
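The hybrid balancing idea and the two threshold metrics named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the abstract names SMOTE and SpreadSubsample but gives no parameters, so the function names (`smote`, `undersample`, `sensitivity_specificity`), the toy data, the neighbour count `k`, and the target class size are all assumptions made for illustration. In particular, `smote` is a simplified SMOTE-style interpolation, and `undersample` is plain random undersampling standing in for SpreadSubsample.

```python
import random
from math import dist


def smote(minority, n_new, k=3, rng=None):
    """Simplified SMOTE-style oversampling (illustrative assumption).

    Each synthetic point is placed at a random position on the line segment
    between a randomly chosen minority sample and one of its k nearest
    minority neighbours. Requires at least two minority samples.
    """
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def undersample(majority, n_keep, rng=None):
    """Random undersampling without replacement (stand-in for SpreadSubsample)."""
    rng = rng or random.Random(0)
    return rng.sample(majority, n_keep)


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity); assumes both classes occur in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 90 majority (no cancer) vs 10 minority (cancer) samples.
majority = [(float(i), float(i % 7)) for i in range(90)]
minority = [(100.0 + i, 50.0 + i) for i in range(10)]

# Hybrid balancing: grow the minority class and shrink the majority class
# until both reach the same (assumed) target size.
target = 40
balanced_minority = minority + smote(minority, target - len(minority))
balanced_majority = undersample(majority, target)

# Metrics on a hypothetical set of predictions (1 = cancer, 0 = no cancer).
sens, spec = sensitivity_specificity([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
```

Sensitivity is the fraction of true cancer cases the classifier catches, which is why it matters most when the positive class is the minority; specificity is the corresponding fraction for the negative class. In practice, balancing should be applied only to the training split so that validation data keeps its original class distribution.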
Pages: 54-63
Page count: 10