PRFE-driven gene selection with multi-classifier ensemble for cancer classification

Authors
Behuria, Smitirekha [1]
Swain, Sujata [1]
Bandyopadhyay, Anjan [1]
Al-Sadoon, Mohammad Khalid [2]
Mallik, Saurav [3, 4]
Affiliations
[1] Kalinga Inst Ind Technol, Sch Comp Engn, Bhubaneswar 751024, Odisha, India
[2] King Saud Univ, Coll Sci, Dept Zool, POB 2455, Riyadh 11451, Saudi Arabia
[3] Harvard TH Chan Sch Publ Hlth, Dept Environm Hlth, Boston, MA 02115 USA
[4] Univ Arizona, Dept Pharmacol & Toxicol, Tucson, AZ 85721 USA
Keywords
Principal recursive feature eliminator (PRFE); Recursive feature elimination; Long short-term memory; LightGBM; CatBoost; Convolutional neural network; Gene expression analysis; BREAST-CANCER; EXPRESSION; ALGORITHM
DOI
10.1016/j.eij.2025.100637
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cancer, marked by uncontrolled cell growth, is a leading cause of mortality worldwide. Its pervasive impact on individuals and societies, together with persistent challenges in treatment and prevention, calls for continued global collaboration and for advanced methods of accurate diagnosis. This study introduces an unsupervised feature selection technique, the Principal Recursive Feature Eliminator (PRFE), for gene selection and cancer classification. Seven classifiers are then applied to increase the model's robustness: Support Vector Machine, Random Forest, CatBoost, Light Gradient Boosting Machine (LightGBM), Artificial Neural Network, Convolutional Neural Network, and Long Short-Term Memory. The proposed approach is evaluated on nine benchmark gene expression datasets with a combination of two different algorithms. A series of experiments assesses the method, focusing on the selected features and on identifying the optimal classifiers, and reports accuracy, precision, recall, and F1-score. The proposed strategy performs better than expected, and the results highlight its potential to improve cancer classification, reaching an accuracy of 99.98%. Across many datasets, the CatBoost and CNN models outperform the other models. This research contributes to ongoing efforts to improve diagnostic precision and treatment outcomes in cancer research.
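The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes a binary tumor/normal labelling and uses made-up toy labels. The function name `binary_metrics` and the data are hypothetical and are not taken from the paper, whose datasets and evaluation code are not shown in this record.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Illustrative helper only; not the authors' evaluation code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (assumed): 1 = tumor sample, 0 = normal sample.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels there are three true positives, three true negatives, one false positive and one false negative, so all four measures equal 0.75. For multi-class datasets the same quantities would typically be computed per class and then averaged.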
Pages: 15