Prognostic model development for classification of colorectal adenocarcinoma by using machine learning model based on feature selection technique boruta

Cited by: 21
Authors
Maurya, Neha Shree [1]
Kushwah, Shikha [1]
Kushwaha, Sandeep [2]
Chawade, Aakash [1,3]
Mani, Ashutosh [1]
Affiliations
[1] Motilal Nehru Natl Inst Technol Allahabad, Dept Biotechnol, Prayagraj 211004, India
[2] Natl Inst Anim Biotechnol, Hyderabad 500032, India
[3] Swedish Univ Agr Sci, Dept Plant Breeding, S-23053 Alnarp, Sweden
Keywords
GLUCAGON-LIKE PEPTIDE-2; CANCER; EXPRESSION; GLP-2; SURVIVAL; PATHWAY; INSULIN; GROWTH;
DOI
10.1038/s41598-023-33327-4
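Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09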
Abstract
Colorectal cancer (CRC) is the third most prevalent cancer type and accounts for nearly one million deaths worldwide. CRC mRNA gene expression datasets from TCGA and GEO (GSE144259, GSE50760, and GSE87096) were analyzed to identify significant differentially expressed genes (DEGs). These genes were then passed through Boruta feature selection, and the confirmed features of importance (genes) were used to develop an ML-based prognostic classification model. The selected genes were further examined by survival analysis and by correlation analysis with infiltrating immunocytes. In total, 770 CRC samples were included: 78 normal and 692 tumor tissue samples. DESeq2 analysis combined with the topconfects R package identified 170 significant DEGs. A random forest (RF) prognostic classification model built on the 33 confirmed features of importance achieved accuracy, precision, recall, and F1-score of 100% with 0% standard deviation. Overall survival analysis narrowed the set to GLP2R and VSTM2A, which were significantly downregulated in tumor samples and strongly correlated with immunocyte infiltration. The involvement of these genes in CRC prognosis was further confirmed on the basis of their biological function and literature analysis. The findings indicate that GLP2R and VSTM2A may play a significant role in CRC progression and immune response suppression.
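The feature selection step named in the abstract can be illustrated with a minimal sketch of the shadow-feature idea behind Boruta. This is not the authors' pipeline. It runs on synthetic data, swaps in a simple standardized mean-difference score for random forest importance, and uses a fixed hit-rate cutoff in place of Boruta's statistical test. The names `score`, `shadow_select`, and `make_toy_data`, and all parameter values, are hypothetical.

```python
import random
import statistics


def score(values, labels):
    """Absolute difference in class means, scaled by the overall spread.

    Assumption: this stands in for random forest importance, which the
    real Boruta procedure uses.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = statistics.pstdev(values) or 1.0  # avoid division by zero
    return abs(statistics.fmean(pos) - statistics.fmean(neg)) / spread


def shadow_select(X, y, n_rounds=50, min_hit_rate=0.9, seed=0):
    """Keep features that beat the best shadow feature in most rounds."""
    rng = random.Random(seed)
    n_features = len(X[0])
    columns = [[row[j] for row in X] for j in range(n_features)]
    real_scores = [score(col, y) for col in columns]
    hits = [0] * n_features
    for _ in range(n_rounds):
        # Shadow features: shuffled copies of the real columns, so any
        # association with the labels is due to chance alone.
        best_shadow = 0.0
        for col in columns:
            shadow = col[:]
            rng.shuffle(shadow)
            best_shadow = max(best_shadow, score(shadow, y))
        for j in range(n_features):
            if real_scores[j] > best_shadow:
                hits[j] += 1
    # Assumption: a fixed hit-rate cutoff replaces Boruta's binomial test.
    return [j for j in range(n_features) if hits[j] / n_rounds >= min_hit_rate]


def make_toy_data(n_samples=200, n_features=6, informative=(0, 1), seed=1):
    """Synthetic expression-like matrix; label 1 plays the role of tumor."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] -= 2.0 * label  # lower values in the "tumor" class
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
selected = shadow_select(X, y)
print(selected)
```

In this toy setup the two shifted columns play the role of downregulated genes and should be retained, while pure-noise columns are expected to be rejected in most runs. A full analysis would instead use a Boruta implementation with random forest importance on the real expression matrix.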
Pages: 14