Detection of colon cancer based on microarray dataset using machine learning as a feature selection and classification techniques

Cited by: 22
Authors
Shafi, A. S. M. [1, 2]
Molla, M. M. Imran [2 ]
Jui, Julakha Jahan [3 ]
Rahman, Mohammad Motiur [1 ]
Affiliations
[1] Mawlana Bhashani Sci & Technol Univ, Dept Comp Sci & Engn, Tangail 1902, Bangladesh
[2] Khwaja Yunus Ali Univ, Fac Comp Sci & Engn, Sirajgonj 6751, Bangladesh
[3] Univ Malaysia Pahang, Fac Elect & Elect Engn, Pekan 26600, Pahang, Malaysia
Source
SN APPLIED SCIENCES | 2020 / Volume 2 / Issue 07
Keywords
Colon cancer; Microarray data; Feature selection; Machine learning; Random forest; Cross validation; PARTICLE SWARM OPTIMIZATION; GENE; PREDICTION;
DOI
10.1007/s42452-020-3051-2
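Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract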
Microarray data are an increasingly important source of gene expression information for analysis and interpretation. In most gene expression studies, researchers try to use the smallest possible set of relevant gene expression profiles to improve tumor identification accuracy. This research aims to analyze and predict colon cancer data using a machine learning approach and a feature selection technique based on a random forest classifier. More specifically, the proposed method reduces the burden of high-dimensional data and allows faster computation by combining "Mean Decrease Accuracy" and "Mean Decrease Gini" as feature selection methods with the well-known Random Forest classifier, with the aim of increasing the prediction model's accuracy. In addition, we compare the model with feature selection against the model without feature selection. Extensive experimental results demonstrate that the proposed model with feature selection is effective and achieves the best accuracy.
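The two importance measures named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it uses only the Python standard library, a hypothetical toy dataset of two "genes", and a fixed threshold rule (`predict`) standing in for a trained random forest. In an actual random forest, the Gini decrease would be summed over all splits on a feature and averaged over trees, and the accuracy decrease would typically be measured on out-of-bag samples.

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_decrease(values, labels, threshold):
    """Impurity decrease from splitting one feature at `threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def accuracy_decrease(predict, rows, labels, feature, repeats=20, seed=0):
    """Mean drop in accuracy after shuffling one feature column."""
    rng = random.Random(seed)
    base = accuracy(predict, rows, labels)
    drops = []
    for _ in range(repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:feature] + [v] + r[feature + 1:]
                    for r, v in zip(rows, column)]
        drops.append(base - accuracy(predict, shuffled, labels))
    return sum(drops) / repeats


# Hypothetical toy data: gene 0 separates the classes, gene 1 is noise.
rows = [[0.1, 0.7], [0.2, 0.3], [0.3, 0.9], [0.4, 0.1],
        [0.6, 0.8], [0.7, 0.2], [0.8, 0.6], [0.9, 0.4]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]


def predict(row):
    """Stand-in classifier (assumption): threshold on gene 0 only."""
    return int(row[0] > 0.5)


for j in range(2):
    g = gini_decrease([r[j] for r in rows], labels, 0.5)
    a = accuracy_decrease(predict, rows, labels, j)
    print(f"gene {j}: Gini decrease = {g:.3f}, accuracy decrease = {a:.3f}")
```

Genes that score near zero on both measures, like gene 1 here, are the ones a feature selection step of this kind would discard before refitting the classifier.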
Pages: 8
Related papers
50 in total
  • [21] Segmented Glioma Classification Using Radiomics-Based Machine Learning: A Comparative Analysis of Feature Selection Techniques
    Jlassi, Amal
    Omri, Amel
    ElBedoui, Khaoula
    Barhoumi, Walid
    AGENTS AND ARTIFICIAL INTELLIGENCE, ICAART 2023, 2024, 14546 : 425 - 447
  • [22] The Feature Extraction and Selection for Electrode Based UHF Partial Discharge Classification Using Different Machine Learning Techniques
    Singh, Nidhi H.
    Kundu, Prasanta
    Chowdhury, Anandita
    2022 IEEE 6TH INTERNATIONAL CONFERENCE ON CONDITION ASSESSMENT TECHNIQUES IN ELECTRICAL SYSTEMS, CATCON, 2022, : 89 - 93
  • [23] Enhancing Software Requirements Classification with Machine Learning and Feature Selection Techniques
    Lanfear, Daniel
    Maleki, Mina
    Banitaan, Shadi
    SOFTWARE AND DATA ENGINEERING, SEDE 2024, 2025, 2244 : 14 - 30
  • [24] Metaheuristic integrated machine learning classification of colon cancer using STFT LASSO and EHO feature extraction from microarray gene expressions
    Nair, Ajin R.
    Rajaguru, Harikumar
    Karthika, M. S.
    Keerthivasan, C.
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [25] Enhancing malware detection with feature selection and scaling techniques using machine learning models
    Hasan, Rakibul
    Biswas, Barna
    Samiun, Md
    Saleh, Mohammad Abu
    Prabha, Mani
    Akter, Jahanara
    Joya, Fatema Haque
    Abdullah, Masuk
    SCIENTIFIC REPORTS, 2025, 15 (01):
  • [26] Reviewing various feature selection techniques in machine learning-based botnet detection
    Baruah, Sangita
    Borah, Dhruba Jyoti
    Deka, Vaskar
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2024, 36 (12):
  • [27] Multiclass Classification of Cancer Based on Microarray Data Using Extreme Learning Machine
    Khadijah
    Rismiyati
    Mantau, Aprinaldi Jasa
    2017 1ST INTERNATIONAL CONFERENCE ON INFORMATICS AND COMPUTATIONAL SCIENCES (ICICOS), 2017, : 159 - 164
  • [28] A Comparative Analysis of Feature Selection Algorithms on Classification of Gene Microarray Dataset
    Jeyachidra, J.
    Punithavalli, M.
    2013 INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND EMBEDDED SYSTEMS (ICICES), 2013, : 1088 - 1093
  • [29] Feature Selection For An Automated Ancient Tamil Script Classification System Using Machine Learning Techniques
    Suganya, T. S.
    Murugavalli, S.
    2017 INTERNATIONAL CONFERENCE ON ALGORITHMS, METHODOLOGY, MODELS AND APPLICATIONS IN EMERGING TECHNOLOGIES (ICAMMAET), 2017,
  • [30] Machine learning techniques and Chi-square feature selection for cancer classification using SAGE gene expression profiles
    Jin, Xin
    Xu, Anbang
    Bie, Rongfang
    Guo, Ping
    DATA MINING FOR BIOMEDICAL APPLICATIONS, PROCEEDINGS, 2006, 3916 : 106 - 115