Enhancing metastatic colorectal cancer prediction through advanced feature selection and machine learning techniques

Cited by: 1
Authors
Yang, Hui [1, 2]
Liu, Jun [3 ]
Yang, Na [4, 5]
Fu, Qingsheng [3 ]
Wang, Yingying [6 ]
Ye, Mingquan [7 ]
Tao, Shaoneng [6 ]
Liu, Xiaocen [6 ]
Li, Qingqing [7 ]
Affiliations
[1] Yijishan Hosp, Affiliated Hosp 1, Wannan Med Coll, Cent Lab, Wuhu, Anhui, Peoples R China
[2] Anhui Prov Key Lab Noncoding RNA Basic & Clin Tran, Wuhu, Anhui, Peoples R China
[3] Yijishan Hosp, Affiliated Hosp 1, Wannan Med Coll, Dept Gastrointestinal Surg, Wuhu, Anhui, Peoples R China
[4] Yijishan Hosp, Affiliated Hosp 1, Wannan Med Coll, Dept Crit Care Med, Wuhu, Anhui, Peoples R China
[5] Clin Res Ctr Crit Resp Med Anhui Prov, Wuhu, Anhui, Peoples R China
[6] Yijishan Hosp, Affiliated Hosp 1, Wannan Med Coll, Dept Nucl Med, Wuhu 241001, Anhui, Peoples R China
[7] Wannan Med Coll, Res Ctr Hlth Big Data Min & Applicat, Sch Med Informat, Wuhu, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Colorectal cancer; Metastasis prediction; Feature selection; Machine learning; EXPRESSION;
DOI
10.1016/j.intimp.2024.113033
Chinese Library Classification (CLC) number
R392 [Medical Immunology]; Q939.91 [Immunology]
Discipline classification code
100102
Abstract
Background and aims: Colorectal cancer (CRC) is the third most prevalent cancer globally, and its high rate of metastasis poses a significant clinical challenge. Approximately 20% of patients with CRC present with distant metastases at diagnosis, and over 50% develop metastases within five years. Accurate prediction of metastasis is crucial for improving survival outcomes in patients with CRC.
Methods: This study introduces a cost-sensitive fast correlation-based filter (CS-FCBF) algorithm for feature selection, integrated with machine learning techniques to predict metastatic CRC. The CS-FCBF algorithm reduced the number of genomic features from 184 to 9 critical genes: CXCL9, C2CD4B, RGCC, GFI1, BEX2, CXCL3, FOXQ1, PBK, and PLAG1. The findings were validated with in vitro and in vivo experiments and with analysis of publicly available single-cell RNA-seq datasets.
Results: Applying the CS-FCBF algorithm significantly improved prediction model performance, with an average 21.16% increase in the area under the precision-recall curve. The nine identified genes hold potential as diagnostic biomarkers and therapeutic targets for metastatic CRC.
Conclusions: This study highlights the role of advanced feature selection methods, combined with machine learning, in addressing class imbalance in medical diagnosis, particularly for CRC. Early detection of metastasis is vital, and the results point to the importance of the identified genes in the metastatic process of CRC. The methodology offers a starting point for future research on other cancers or diseases that face similar diagnostic challenges.
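The abstract names the feature selection step but does not describe how CS-FCBF works internally. As a rough illustration only, the sketch below implements the standard fast correlation-based filter (FCBF) that the name builds on: features are ranked by symmetrical uncertainty (SU) with the class label, and a feature is dropped when an already selected feature is at least as correlated with it as the class is. The cost-sensitive modification is not specified in the abstract and is not implemented here. The function names, the threshold `delta`, and the toy "genes" are all hypothetical, and real expression data would need to be discretized first.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def fcbf(features, labels, delta=0.0):
    """Standard (not cost-sensitive) FCBF on discrete features.

    features: dict mapping feature name -> list of discrete values
    labels:   list of class labels, same length as each feature
    delta:    minimum SU with the class for a feature to be considered
    Returns the names of the selected features.
    """
    # Step 1: relevance of each feature to the class, ranked high to low.
    relevance = {name: symmetrical_uncertainty(col, labels)
                 for name, col in features.items()}
    candidates = sorted((n for n in relevance if relevance[n] > delta),
                        key=lambda n: relevance[n], reverse=True)

    # Step 2: keep the top candidate, then discard every remaining feature
    # that is at least as correlated with it as with the class.
    selected = []
    while candidates:
        best = candidates.pop(0)
        selected.append(best)
        candidates = [
            q for q in candidates
            if symmetrical_uncertainty(features[best], features[q]) < relevance[q]
        ]
    return selected


# Hypothetical toy data: 8 samples, binary class (0 = non-metastatic, 1 = metastatic).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_features = {
    "gene_a": [0, 0, 0, 1, 1, 1, 1, 1],  # informative
    "gene_b": [0, 0, 0, 1, 1, 1, 1, 1],  # exact duplicate of gene_a (redundant)
    "gene_c": [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the class (irrelevant)
    "gene_d": [0, 0, 0, 0, 0, 1, 1, 1],  # informative, only partly overlaps gene_a
}
selected = fcbf(toy_features, labels, delta=0.05)
print(sorted(selected))
```

On this toy input the duplicate `gene_b` is removed as redundant and `gene_c` falls below the relevance threshold, so two features remain. A cost-sensitive variant would presumably weight the minority (metastatic) class more heavily when scoring relevance, but that detail is an assumption rather than something the abstract states.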
Pages: 10
Related papers
50 in total
  • [21] Enhancing stroke disease classification through machine learning models via a novel voting system by feature selection techniques
    Hasan, Mahade
    Yasmin, Farhana
    Hassan, Md. Mehedi
    Yu, Xue
    Yeasmin, Soniya
    Joshi, Herat
    Islam, Sheikh Mohammed Shariful
    PLOS ONE, 2025, 20 (01)
  • [22] A Survey of Feature Selection and Feature Extraction Techniques in Machine Learning
    Khalid, Samina
    Khalil, Tehmina
    Nasreen, Shamila
    2014 SCIENCE AND INFORMATION CONFERENCE (SAI), 2014, : 372 - 378
  • [23] Enhancing SIoT Security Through Advanced Machine Learning Techniques for Intrusion Detection
    Divya, S.
    Tanuja, R.
    COMMUNICATION AND INTELLIGENT SYSTEMS, VOL 1, ICCIS 2023, 2024, 967 : 105 - 116
  • [24] Diabetes Prediction: Optimization of Machine Learning through Feature Selection and Dimensionality Reduction
    Aouragh, Abd Allah
    Bahaj, Mohamed
    Toufik, Fouad
    INTERNATIONAL JOURNAL OF ONLINE AND BIOMEDICAL ENGINEERING, 2024, 20 (08) : 100 - 114
  • [25] Enhancing Heart Disease Prediction Accuracy through Machine Learning Techniques and Optimization
    Chandrasekhar, Nadikatla
    Peddakrishna, Samineni
    PROCESSES, 2023, 11 (04)
  • [26] Enhancing spatial streamflow prediction through machine learning algorithms and advanced strategies
    Cheghabaleki, Sedigheh Darabi
    Fatemi, Seyed Ehsan
    Mavadat, Maryam Hafezparast
    APPLIED WATER SCIENCE, 2024, 14 (06)
  • [27] Enhancing Vault Prediction and ICL Sizing Through Advanced Machine Learning Models
    Zhu, Jun
    Li, Fen-Fen
    Li, Gao-Xiang
    Jiang, Shang-Yang
    Cheng, Dan
    Bao, Fang-Jun
    Wu, Shuang-Qing
    Dai, Qi
    Ye, Yu-Feng
    JOURNAL OF REFRACTIVE SURGERY, 2024, 40 (03) : e126 - e132
  • [28] Lung cancer prediction using machine learning and advanced imaging techniques
    Kadir, Timor
    Gleeson, Fergus
    TRANSLATIONAL LUNG CANCER RESEARCH, 2018, 7 (03) : 304 - 312
  • [29] Enhancing machine learning-based sentiment analysis through feature extraction techniques
    Semary, Noura A.
    Ahmed, Wesam
    Amin, Khalid
    Plawiak, Pawel
    Hammad, Mohamed
    PLOS ONE, 2024, 19 (02)
  • [30] Machine-Learning Techniques for Feature Selection and Prediction of Mortality in Elderly CABG Patients
    Huang, Yen-Chun
    Li, Shao-Jung
    Chen, Mingchih
    Lee, Tian-Shyug
    Chien, Yu-Ning
    HEALTHCARE, 2021, 9 (05)