Prognostic model development for classification of colorectal adenocarcinoma by using machine learning model based on feature selection technique boruta

Times cited: 21
Authors
Maurya, Neha Shree [1]
Kushwah, Shikha [1]
Kushwaha, Sandeep [2]
Chawade, Aakash [1, 3]
Mani, Ashutosh [1]
Affiliations
[1] Motilal Nehru Natl Inst Technol Allahabad, Dept Biotechnol, Prayagraj 211004, India
[2] Natl Inst Anim Biotechnol, Hyderabad 500032, India
[3] Swedish Univ Agr Sci, Dept Plant Breeding, S-23053 Alnarp, Sweden
Keywords
GLUCAGON-LIKE PEPTIDE-2; CANCER; EXPRESSION; GLP-2; SURVIVAL; PATHWAY; INSULIN; GROWTH;
DOI
10.1038/s41598-023-33327-4
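Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification code
07; 0710; 09
Abstract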
Colorectal cancer (CRC) is the third most prevalent cancer type and accounts for nearly one million deaths worldwide. CRC mRNA gene expression datasets from TCGA and GEO (GSE144259, GSE50760, and GSE87096) were analyzed to identify significant differentially expressed genes (DEGs). These genes were then passed through Boruta feature selection, and the confirmed important features (genes) were used to develop an ML-based prognostic classification model. The selected genes were further examined by survival analysis and by correlation analysis against infiltrating immunocytes. In total, 770 CRC samples were included: 78 normal and 692 tumor tissue samples. DESeq2 analysis combined with the topconfects R package identified 170 significant DEGs. A random forest (RF) prognostic classification model built on the 33 confirmed important features achieved accuracy, precision, recall, and F1-score of 100% with 0% standard deviation. Overall survival analysis narrowed the candidates to GLP2R and VSTM2A, both significantly downregulated in tumor samples and strongly correlated with immunocyte infiltration. The involvement of these genes in CRC prognosis was further confirmed on the basis of their biological function and literature analysis. These findings indicate that GLP2R and VSTM2A may play a significant role in CRC progression and in suppression of the immune response.
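The shadow-feature idea behind Boruta, as used in the feature-selection step described in the abstract, can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a toy dataset, uses absolute Pearson correlation with the class label as a stand-in for random-forest importance, and omits the multiple-testing correction that Boruta applies. All function and feature names below (`shadow_feature_selection`, `make_toy_data`, `informative`, `noise_0`, ...) are hypothetical.

```python
import math
import random


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; a stand-in for random-forest importance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / math.sqrt(var_x * var_y)


def binomial_upper_tail(hits, trials, p=0.5):
    """P(X >= hits) for X ~ Binomial(trials, p)."""
    return sum(
        math.comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )


def shadow_feature_selection(columns, labels, n_iter=50, alpha=0.01, seed=0):
    """Return names of features that repeatedly beat the best shadow feature.

    columns: dict mapping feature name -> list of values (one per sample)
    labels:  list of 0/1 class labels (for example normal vs tumor)
    """
    rng = random.Random(seed)
    # With this deterministic importance measure the real scores are fixed;
    # with a random forest they would be re-estimated in every iteration.
    real_importance = {
        name: abs_correlation(vals, labels) for name, vals in columns.items()
    }
    hits = {name: 0 for name in columns}
    for _ in range(n_iter):
        # Shadow features: shuffled copies of the real columns, so any
        # association with the labels is due to chance alone.
        best_shadow = 0.0
        for vals in columns.values():
            shadow = vals[:]
            rng.shuffle(shadow)
            best_shadow = max(best_shadow, abs_correlation(shadow, labels))
        for name in columns:
            if real_importance[name] > best_shadow:
                hits[name] += 1
    # Confirm a feature when its hit count is unlikely under a fair coin.
    return [
        name for name in columns
        if binomial_upper_tail(hits[name], n_iter) < alpha
    ]


def make_toy_data(n_per_class=30, n_noise=5, seed=1):
    """Synthetic data: one label-dependent feature plus pure-noise features."""
    rng = random.Random(seed)
    labels = [0] * n_per_class + [1] * n_per_class
    columns = {"informative": [3.0 * y + rng.gauss(0, 1) for y in labels]}
    for i in range(n_noise):
        columns[f"noise_{i}"] = [rng.gauss(0, 1) for _ in labels]
    return columns, labels


if __name__ == "__main__":
    toy_columns, toy_labels = make_toy_data()
    print(shadow_feature_selection(toy_columns, toy_labels))
```

In a setting like the one in the abstract, the columns would be expression values of the significant DEGs and the labels would mark normal versus tumor samples; the confirmed features would then feed a classifier. Because this sketch has no multiple-testing correction, a noise feature can occasionally pass by chance.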
Pages: 14