Prognostic model development for classification of colorectal adenocarcinoma by using machine learning model based on feature selection technique Boruta

Cited by: 21
Authors
Maurya, Neha Shree [1 ]
Kushwah, Shikha [1 ]
Kushwaha, Sandeep [2 ]
Chawade, Aakash [1 ,3 ]
Mani, Ashutosh [1 ]
Affiliations
[1] Motilal Nehru Natl Inst Technol Allahabad, Dept Biotechnol, Prayagraj 211004, India
[2] Natl Inst Anim Biotechnol, Hyderabad 500032, India
[3] Swedish Univ Agr Sci, Dept Plant Breeding, S-23053 Alnarp, Sweden
Keywords
GLUCAGON-LIKE PEPTIDE-2; CANCER; EXPRESSION; GLP-2; SURVIVAL; PATHWAY; INSULIN; GROWTH
DOI
10.1038/s41598-023-33327-4
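Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Subject classification code
07; 0710; 09
Abstract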
Colorectal cancer (CRC) is the third most prevalent cancer type and accounts for nearly one million deaths worldwide. CRC mRNA gene expression datasets from TCGA and GEO (GSE144259, GSE50760, and GSE87096) were analyzed to identify significant differentially expressed genes (DEGs). These genes were then passed through Boruta feature selection, and the confirmed features of importance (genes) were used to develop an ML-based prognostic classification model. The selected genes were further examined through survival analysis and through correlation analysis with infiltrated immunocytes. A total of 770 CRC samples were included, comprising 78 normal and 692 tumor tissue samples. DESeq2 analysis, together with the topconfects R package, identified 170 significant DEGs. The random forest (RF) prognostic classification model built on the 33 confirmed features of importance achieved accuracy, precision, recall, and F1-score of 100% with 0% standard deviation. Overall survival analysis narrowed the candidates to GLP2R and VSTM2A, both significantly downregulated in tumor samples and strongly correlated with immunocyte infiltration. The involvement of these genes in CRC prognosis was further supported by their biological function and by literature analysis. The current findings indicate that GLP2R and VSTM2A may play a significant role in CRC progression and immune response suppression.
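The shadow-feature idea behind Boruta can be illustrated with a small, self-contained sketch. This is not the authors' pipeline: the data are synthetic, the names `mean_gap`, `shadow_select`, and `make_toy_data` are hypothetical, and a simple difference in class means stands in for the random forest importance that Boruta actually uses. The hit-rate cutoff is also a simplification of Boruta's statistical test.

```python
import random


def mean_gap(column, labels):
    """Toy importance score: absolute difference between the two class means.

    Assumption: this univariate score replaces random forest importance.
    """
    pos = [v for v, y in zip(column, labels) if y == 1]
    neg = [v for v, y in zip(column, labels) if y == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))


def shadow_select(columns, labels, n_iter=20, min_hit_rate=0.9, seed=0):
    """Keep features that beat the best shuffled (shadow) feature often enough."""
    rng = random.Random(seed)
    real_scores = [mean_gap(col, labels) for col in columns]
    hits = [0] * len(columns)
    for _ in range(n_iter):
        # Build one shadow per feature by permuting its values, which
        # destroys any relationship with the labels.
        shadow_max = 0.0
        for col in columns:
            shadow = col[:]
            rng.shuffle(shadow)
            shadow_max = max(shadow_max, mean_gap(shadow, labels))
        # A real feature scores a hit when it outperforms every shadow.
        for j, score in enumerate(real_scores):
            if score > shadow_max:
                hits[j] += 1
    return [j for j, h in enumerate(hits) if h >= min_hit_rate * n_iter]


def make_toy_data(n_samples=100, n_noise=5, seed=1):
    """Synthetic data: feature 0 tracks the label, the rest are pure noise."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n_samples)]
    informative = [5.0 * y + rng.gauss(0.0, 1.0) for y in labels]
    noise = [[rng.gauss(0.0, 1.0) for _ in labels] for _ in range(n_noise)]
    return [informative] + noise, labels


columns, labels = make_toy_data()
confirmed = shadow_select(columns, labels)
print("confirmed feature indices:", confirmed)
```

In a real analysis the confirmed features would then feed a classifier, as the abstract describes for the 33 confirmed genes. The sketch only shows why a feature must outperform permuted copies of the data before it is kept.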
Pages: 14