Prognostic model development for classification of colorectal adenocarcinoma by using machine learning model based on feature selection technique boruta

Cited by: 21
Authors
Maurya, Neha Shree [1]
Kushwah, Shikha [1]
Kushwaha, Sandeep [2]
Chawade, Aakash [1, 3]
Mani, Ashutosh [1]
Affiliations
[1] Motilal Nehru Natl Inst Technol Allahabad, Dept Biotechnol, Prayagraj 211004, India
[2] Natl Inst Anim Biotechnol, Hyderabad 500032, India
[3] Swedish Univ Agr Sci, Dept Plant Breeding, S-23053 Alnarp, Sweden
Keywords
GLUCAGON-LIKE PEPTIDE-2; CANCER; EXPRESSION; GLP-2; SURVIVAL; PATHWAY; INSULIN; GROWTH;
DOI
10.1038/s41598-023-33327-4
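Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code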
07 ; 0710 ; 09 ;
Abstract
Colorectal cancer (CRC) is the third most prevalent cancer type and accounts for nearly one million deaths worldwide. CRC mRNA gene expression datasets from TCGA and GEO (GSE144259, GSE50760, and GSE87096) were analyzed to identify significant differentially expressed genes (DEGs). These genes were then passed to the Boruta feature selection algorithm, and the features (genes) it confirmed as important were used to develop an ML-based prognostic classification model. The selected genes were further examined through survival analysis and through correlation analysis with infiltrating immunocytes. A total of 770 CRC samples were included, comprising 78 normal and 692 tumor tissue samples. DESeq2 analysis together with the topconfects R package identified 170 significant DEGs. The random forest (RF) prognostic classification model built on the 33 Boruta-confirmed features achieved accuracy, precision, recall, and F1-score of 100% with 0% standard deviation. Overall survival analysis narrowed the candidates to GLP2R and VSTM2A, both significantly downregulated in tumor samples and strongly correlated with immunocyte infiltration. The involvement of these genes in CRC prognosis was further confirmed on the basis of their biological function and literature analysis. The findings indicate that GLP2R and VSTM2A may play a significant role in CRC progression and immune response suppression.
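The shadow-feature idea behind Boruta, as mentioned in the abstract, can be illustrated with a minimal sketch. This is not the authors' pipeline: it runs on synthetic toy data, it swaps in a simple class-mean separation score in place of random-forest importance, and it replaces Boruta's statistical test with a fixed hit-fraction threshold. The names `separation_score`, `shadow_feature_selection`, and `make_toy_data` are hypothetical helpers introduced only for this illustration.

```python
import random
import statistics


def separation_score(values, labels):
    """Absolute difference of class means divided by the pooled standard deviation.

    Assumption: this stands in for random-forest importance, which Boruta uses.
    """
    group0 = [v for v, lab in zip(values, labels) if lab == 0]
    group1 = [v for v, lab in zip(values, labels) if lab == 1]
    pooled_sd = ((statistics.pvariance(group0) + statistics.pvariance(group1)) / 2) ** 0.5
    if pooled_sd == 0:
        return 0.0
    return abs(statistics.mean(group0) - statistics.mean(group1)) / pooled_sd


def shadow_feature_selection(columns, labels, n_iter=50, hit_fraction=0.9, seed=0):
    """Return indices of features that beat the best shadow feature often enough.

    A shadow feature is a shuffled copy of a real feature, so any score it
    reaches is due to chance. Simplification: a fixed hit fraction is used
    instead of the statistical test applied by Boruta.
    """
    rng = random.Random(seed)
    # The stand-in score is deterministic, so real scores are computed once.
    real_scores = [separation_score(col, labels) for col in columns]
    hits = [0] * len(columns)
    for _ in range(n_iter):
        shadow_scores = []
        for col in columns:
            shadow = col[:]          # copy, then destroy any link to the labels
            rng.shuffle(shadow)
            shadow_scores.append(separation_score(shadow, labels))
        best_shadow = max(shadow_scores)
        for j, score in enumerate(real_scores):
            if score > best_shadow:
                hits[j] += 1
    return [j for j, h in enumerate(hits) if h >= hit_fraction * n_iter]


def make_toy_data(n_per_class=30, seed=1):
    """Synthetic data: features 0 and 1 depend on the label, features 2-4 are noise."""
    rng = random.Random(seed)
    labels = [0] * n_per_class + [1] * n_per_class
    informative = [[rng.gauss(3.0 * lab, 1.0) for lab in labels] for _ in range(2)]
    noise = [[rng.gauss(0.0, 1.0) for _ in labels] for _ in range(3)]
    return informative + noise, labels


columns, labels = make_toy_data()
confirmed = shadow_feature_selection(columns, labels)
print("confirmed feature indices:", confirmed)
```

In this toy setting the two label-dependent features separate the classes far more strongly than any shuffled copy can, so they are retained. A real analysis would more likely use an established Boruta implementation on the DEG expression matrix rather than this simplified score.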
Pages: 14
Related papers
50 in total
  • [41] Development and validation of a multimodal feature fusion prognostic model for lumbar degenerative disease based on machine learning: a study protocol
    Wang, Zhipeng
    Zhao, Xiyun
    Li, Yuanzhen
    Zhang, Hongwei
    Qin, Daping
    Qi, Xin
    Chen, Yixin
    Zhang, Xiaogang
    BMJ OPEN, 2023, 13 (09):
  • [42] A novel interpretability machine learning model for wind speed forecasting based on feature and sub-model selection
    Shang, Zhihao
    Chen, Yanhua
    Lai, Daokai
    Li, Min
    Yang, Yi
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 255
  • [43] Developing a machine learning based building energy consumption prediction approach using limited data: Boruta feature selection and empirical mode decomposition
    Qiao, Qingyao
    Yunusa-Kaltungo, Akilu
    Edwards, Rodger E.
    ENERGY REPORTS, 2023, 9 : 3643 - 3660
  • [44] Machine learning-based immune prognostic model and ceRNA network construction for lung adenocarcinoma
    He, Xiaoqian
    Su, Ying
    Liu, Pei
    Chen, Cheng
    Chen, Chen
    Guan, Haoqin
    Lv, Xiaoyi
    Guo, Wenjia
    JOURNAL OF CANCER RESEARCH AND CLINICAL ONCOLOGY, 2023, 149 (10) : 7379 - 7392
  • [45] Machine learning-based immune prognostic model and ceRNA network construction for lung adenocarcinoma
    Xiaoqian He
    Ying Su
    Pei Liu
    Cheng Chen
    Chen Chen
    Haoqin Guan
    Xiaoyi Lv
    Wenjia Guo
    Journal of Cancer Research and Clinical Oncology, 2023, 149 : 7379 - 7392
  • [46] Development and Validation of a Machine Learning-Based Prognostic Model for Atypical Meningioma
    Kim, D.
    Kim, Y.
    Sung, W.
    Kim, I. A.
    Cho, J.
    Lee, J. H.
    Grassberger, C.
    Byun, H. K.
    Chang, W. I.
    Ren, L.
    Gong, Y.
    Wee, C. W.
    Hua, L.
    Yoon, H. I.
    MEDICAL PHYSICS, 2024, 51 (10) : 7968 - 7968
  • [47] Prognostic Model Development with Missing Labels: A Condition-Based Maintenance Approach Using Machine Learning
    Patrick Zschech
    Kai Heinrich
    Raphael Bink
    Janis S. Neufeld
    Business & Information Systems Engineering, 2019, 61 : 327 - 343
  • [48] Prognostic Model Development with Missing Labels: A Condition-Based Maintenance Approach Using Machine Learning
    Zschech, Patrick
    Heinrich, Kai
    Bink, Raphael
    Neufeld, Janis S.
    BUSINESS & INFORMATION SYSTEMS ENGINEERING, 2019, 61 (03) : 327 - 343
  • [49] Machine learning approaches for classification of colorectal cancer with and without feature selection method on microarray data
    Nazari, Elham
    Aghemiri, Mehran
    Avan, Amir
    Mehrabian, Amin
    Tabesh, Hamed
    GENE REPORTS, 2021, 25
  • [50] Base Station Model Selection Using Machine Learning Technique for Wireless Sensor Network
    Pooja Gulganwa
    Saurabh Jain
    Wireless Personal Communications, 2023, 132 : 1225 - 1239