Enhancing metastatic colorectal cancer prediction through advanced feature selection and machine learning techniques

Cited by: 1
Authors
Yang, Hui [1, 2]
Liu, Jun [3]
Yang, Na [4, 5]
Fu, Qingsheng [3 ]
Wang, Yingying [6 ]
Ye, Mingquan [7 ]
Tao, Shaoneng [6 ]
Liu, Xiaocen [6 ]
Li, Qingqing [7 ]
Affiliations
[1] Yijishan Hosp, Affiliated Hosp 1, Wannan Med Coll, Cent Lab, Wuhu, Anhui, Peoples R China
[2] Anhui Prov Key Lab Noncoding RNA Basic & Clin Tran, Wuhu, Anhui, Peoples R China
[3] Yijishan Hosp, Affiliated Hosp 1, Wannan Med Coll, Dept Gastrointestinal Surg, Wuhu, Anhui, Peoples R China
[4] Yijishan Hosp, Affiliated Hosp 1, Wannan Med Coll, Dept Crit Care Med, Wuhu, Anhui, Peoples R China
[5] Clin Res Ctr Crit Resp Med Anhui Prov, Wuhu, Anhui, Peoples R China
[6] Yijishan Hosp, Affiliated Hosp 1, Wannan Med Coll, Dept Nucl Med, Wuhu 241001, Anhui, Peoples R China
[7] Wannan Med Coll, Res Ctr Hlth Big Data Min & Applicat, Sch Med Informat, Wuhu, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Colorectal cancer; Metastasis prediction; Feature selection; Machine learning; EXPRESSION;
DOI
10.1016/j.intimp.2024.113033
Chinese Library Classification (CLC) number
R392 [Medical immunology]; Q939.91 [Immunology]
Discipline classification code
100102
Abstract
Background and aims: Colorectal cancer (CRC) is the third most prevalent cancer globally and poses a significant clinical challenge because of its high rate of metastasis. Approximately 20% of patients with CRC present with distant metastases at diagnosis, and over 50% develop metastases within five years. Accurate prediction of metastasis is crucial for improving survival outcomes in patients with CRC.
Methods: This study introduces a cost-sensitive fast correlation-based filter (CS-FCBF) algorithm for feature selection, integrated with machine learning techniques to predict metastatic CRC. The CS-FCBF algorithm reduced the number of genomic features from 184 to 9 critical genes: CXCL9, C2CD4B, RGCC, GFI1, BEX2, CXCL3, FOXQ1, PBK, and PLAG1. The findings were validated through in vitro and in vivo experiments and through analysis of publicly available single-cell RNA-seq datasets.
Results: Applying the CS-FCBF algorithm led to a significant improvement in prediction model performance, with an average 21.16% increase in the area under the precision-recall curve. The nine identified genes hold potential as diagnostic biomarkers and therapeutic targets for metastatic CRC.
Conclusions: This study highlights the critical role of advanced feature selection methods, combined with machine learning, in addressing class imbalance in medical diagnosis, particularly for CRC. Early detection of metastasis is vital, and the findings underscore the importance of the identified genes in the metastatic process of CRC. The methodology offers valuable insights and paves the way for future research in other cancers or diseases that face similar diagnostic challenges.
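For readers unfamiliar with the filter that CS-FCBF builds on, the following is a minimal sketch of the plain fast correlation-based filter (FCBF), which ranks features by symmetrical uncertainty with the class and then discards features that are more correlated with an already selected feature than with the class. The abstract does not describe how the cost-sensitive variant modifies this procedure, so no cost weighting is attempted here. The function names, the `delta` threshold, and the toy data are assumptions made for illustration, and features are assumed to be already discretized.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def fcbf(features, labels, delta=0.0):
    """Plain FCBF (no cost sensitivity); returns selected feature names.

    features: dict mapping feature name -> list of discrete values
    labels:   list of class labels, same length as each feature column
    delta:    minimum SU with the class for a feature to be a candidate
    """
    relevance = {name: symmetrical_uncertainty(col, labels)
                 for name, col in features.items()}
    # Candidates ranked by relevance to the class, most relevant first.
    ranked = sorted((n for n in relevance if relevance[n] >= delta),
                    key=lambda n: relevance[n], reverse=True)
    selected = []
    while ranked:
        best = ranked.pop(0)
        selected.append(best)
        # Keep only features less correlated with `best` than with the class.
        ranked = [n for n in ranked
                  if symmetrical_uncertainty(features[best], features[n])
                  < relevance[n]]
    return selected


# Toy data (hypothetical): "a" tracks the label, "a_copy" is its mirror
# image and therefore redundant, and "noise" is independent of the label.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "a_copy": [1, 1, 1, 1, 0, 0, 0, 0],
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],
}
print(fcbf(features, labels))
```

On this toy input the redundant and uninformative columns are dropped and a single feature remains. A cost-sensitive variant would presumably change how relevance is scored for the minority (metastatic) class, but that detail would need to be taken from the full paper.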
Pages: 10
Related papers
50 in total
  • [32] A reliable method for colorectal cancer prediction based on feature selection and support vector machine
    Zhao, Dandan
    Liu, Hong
    Zheng, Yuanjie
    He, Yanlin
    Lu, Dianjie
    Lyu, Chen
    MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2019, 57 (04) : 901 - 912
  • [33] A Comparative Study for Breast Cancer Prediction using Machine Learning and Feature Selection
    Dhanya, R.
    Paul, Irene Rose
    Akula, Sai Sindhu
    Sivakumar, Madhumathi
    Nair, Jyothisha J.
    PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICCS), 2019, : 1049 - 1055
  • [34] Feature selection and classification in breast cancer prediction using IoT and machine learning
    Gopal, V. Nanda
    Al-Turjman, Fadi
    Kumar, R.
    Anand, L.
    Rajesh, M.
    MEASUREMENT, 2021, 178
  • [36] Diagnosis of polycystic ovary syndrome through different machine learning and feature selection techniques
    Danaei Mehr, Homay
    Polat, Huseyin
    HEALTH AND TECHNOLOGY, 2022, 12 (01) : 137 - 150
  • [37] A New Multi-Phase Feature Selection Framework for The Prediction of Breast Cancer Drug Using Machine Learning Techniques
    Shobana, G.
    Priya, N.
    JOURNAL OF ALGEBRAIC STATISTICS, 2022, 13 (02) : 300 - 312
  • [38] Optimizing machine-learning models for mutagenicity prediction through better feature selection
    Shinada, Nicolas K.
    Koyama, Naoki
    Ikemori, Megumi
    Nishioka, Tomoki
    Hitaoka, Seiji
    Hakura, Atsushi
    Asakura, Shoji
    Matsuoka, Yukiko
    Palaniappan, Sucheendra K.
    MUTAGENESIS, 2022, 37 (3-4) : 191 - 202
  • [39] Soil salinization prediction through feature selection and machine learning at the irrigation district scale
    Xie, Junbo
    Shi, Cong
    Liu, Yang
    Wang, Qi
    Zhong, Zhibo
    He, Shuai
    Wang, Xingpeng
    FRONTIERS IN EARTH SCIENCE, 2025, 12
  • [40] Enhancing Classification Performance through FeatureBoostThyro: A Comparative Study of Machine Learning Algorithms and Feature Selection
    Bhende, Deepali
    Sakarkar, Gopal
    Khandar, Punam
    Uparkar, Satyajit
    Bhave, Arvind
    INTERNATIONAL JOURNAL OF ONLINE AND BIOMEDICAL ENGINEERING, 2024, 20 (04) : 29 - 42