Detection of colon cancer based on microarray dataset using machine learning as a feature selection and classification techniques

Cited by: 22
Authors
Shafi, A. S. M. [1 ,2 ]
Molla, M. M. Imran [2 ]
Jui, Julakha Jahan [3 ]
Rahman, Mohammad Motiur [1 ]
Affiliations
[1] Mawlana Bhashani Sci & Technol Univ, Dept Comp Sci & Engn, Tangail 1902, Bangladesh
[2] Khwaja Yunus Ali Univ, Fac Comp Sci & Engn, Sirajgonj 6751, Bangladesh
[3] Univ Malaysia Pahang, Fac Elect & Elect Engn, Pekan 26600, Pahang, Malaysia
Source
SN APPLIED SCIENCES | 2020, Vol. 2, Issue 07
Keywords
Colon cancer; Microarray data; Feature selection; Machine learning; Random forest; Cross validation; Particle swarm optimization; Gene; Prediction
DOI
10.1007/s42452-020-3051-2
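Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract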
Microarray data are an increasingly important source of gene expression information for analysis and interpretation. In most gene expression studies, researchers try to use the smallest possible set of relevant gene expression profiles to improve tumor identification accuracy. This research aims to analyze and predict colon cancer data using a machine learning approach and a feature selection technique based on a random forest classifier. More specifically, the proposed method reduces the burden of high-dimensional data and allows faster computation by combining "Mean Decrease Accuracy" and "Mean Decrease Gini" as feature selection methods within the well-known Random Forest classifier, with the aim of increasing the prediction model's accuracy. In addition, we compare the model with feature selection against the model without feature selection. Extensive experimental results demonstrate that the proposed model with feature selection is effective and achieves the best accuracy.
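The feature selection idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of a real colon cancer microarray, treats `feature_importances_` as an analogue of "Mean Decrease Gini" and `permutation_importance` as an analogue of "Mean Decrease Accuracy", and combines the two by averaging rank positions, which is a hypothetical rule chosen only for illustration. The helper names `make_toy_expression_data`, `rank_genes`, and `select_top_k` are likewise invented here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split


def make_toy_expression_data(n_samples=80, n_genes=50, n_informative=5, seed=0):
    """Synthetic stand-in for a microarray matrix (samples x genes)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_genes))
    y = np.arange(n_samples) % 2          # alternating labels: 0 = normal, 1 = tumor
    X[y == 1, :n_informative] += 3.0      # only the first genes carry class signal
    return X, y


def rank_genes(X, y, seed=0):
    """Fit one random forest and return (gini, perm) importance arrays."""
    forest = RandomForestClassifier(n_estimators=100, random_state=seed).fit(X, y)
    gini = forest.feature_importances_    # impurity-based importance (Gini analogue)
    perm = permutation_importance(        # accuracy drop when a gene is shuffled
        forest, X, y, n_repeats=5, random_state=seed
    ).importances_mean
    return gini, perm


def select_top_k(gini, perm, k):
    """Assumed combination rule: average the two rank positions, keep the k best."""
    gini_rank = np.argsort(np.argsort(-gini))   # 0 = most important by Gini
    perm_rank = np.argsort(np.argsort(-perm))   # 0 = most important by permutation
    combined = (gini_rank + perm_rank) / 2.0
    return np.argsort(combined, kind="stable")[:k]


X, y = make_toy_expression_data()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Rank genes on the training split only, so the test split stays unseen.
gini, perm = rank_genes(X_train, y_train)
selected = select_top_k(gini, perm, k=10)

# Compare a forest trained on all genes with one trained on the selected genes.
full_model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
reduced_model = RandomForestClassifier(n_estimators=100, random_state=0).fit(
    X_train[:, selected], y_train
)

acc_full = full_model.score(X_test, y_test)
acc_reduced = reduced_model.score(X_test[:, selected], y_test)
print(f"all genes: {acc_full:.2f}, selected genes: {acc_reduced:.2f}")
```

Ranking the genes on the training split alone keeps the held-out comparison honest. With cross-validation, the selection step would need to be repeated inside each fold for the same reason.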
Pages: 8