Predicting Colorectal Cancer Survival Using Time-to-Event Machine Learning: Retrospective Cohort Study

Cited by: 7
Authors
Yang, Xulin [1]
Qiu, Hang [1, 2]
Wang, Liya [2]
Wang, Xiaodong [3]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, 2006 Xiyuan Ave, Chengdu 611731, Peoples R China
[2] Univ Elect Sci & Technol China, Big Data Res Ctr, Chengdu, Peoples R China
[3] Sichuan Univ, West China Hosp, Dept Gastrointestinal Surg, Chengdu, Peoples R China
Keywords
colorectal cancer; survival prediction; machine learning; time-to-event; SHAP; SHapley Additive exPlanations; DIAGNOSIS; MODELS
DOI
10.2196/44417
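Chinese Library Classification number
R19 [Health care organizations and services (health services administration)]
Discipline classification code
Abstract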
Background: Machine learning (ML) methods have shown great potential in predicting colorectal cancer (CRC) survival. However, the ML models introduced thus far have mainly focused on binary outcomes and have not considered the time-to-event nature of this type of modeling.
Objective: This study aims to evaluate the performance of ML approaches for modeling time-to-event survival data and to develop transparent models for predicting CRC-specific survival.
Methods: The data set used in this retrospective cohort study contains information on patients who were newly diagnosed with CRC between December 28, 2012, and December 27, 2019, at West China Hospital, Sichuan University. We assessed the performance of 6 representative ML models in predicting CRC-specific survival: random survival forest (RSF), gradient boosting machine (GBM), DeepSurv, DeepHit, neural net-extended time-dependent Cox (or Cox-Time), and neural multitask logistic regression (N-MTLR). The multiple imputation by chained equations method was applied to handle missing values in variables. Multivariable analysis and clinical experience were used to select significant features associated with CRC survival. Model performance was evaluated in stratified 5-fold cross-validation repeated 5 times by using the time-dependent concordance index, integrated Brier score, calibration curves, and decision curves. The SHapley Additive exPlanations method was applied to calculate feature importance.
Results: A total of 2157 patients with CRC were included in this study. Among the 6 time-to-event ML models, the DeepHit model exhibited the best discriminative ability (time-dependent concordance index 0.789, 95% CI 0.779-0.799) and the RSF model produced better-calibrated survival estimates (integrated Brier score 0.096, 95% CI 0.094-0.099), but these differences were not statistically significant. Additionally, the RSF, GBM, DeepSurv, Cox-Time, and N-MTLR models had comparable predictive accuracy to the Cox Proportional Hazards model in terms of discrimination and calibration. The calibration curves showed that all the ML models exhibited good 5-year survival calibration. The decision curves for CRC-specific survival at 5 years showed that all the ML models, especially RSF, had higher net benefits than the default strategies of treating all or no patients at a range of clinically reasonable risk thresholds. The SHapley Additive exPlanations method revealed that R0 resection, tumor-node-metastasis staging, and the number of positive lymph nodes were important factors for 5-year CRC-specific survival.
Conclusions: This study showed the potential of applying time-to-event ML predictive algorithms to help predict CRC-specific survival. The RSF, GBM, Cox-Time, and N-MTLR algorithms could provide nonparametric alternatives to the Cox Proportional Hazards model in estimating the survival probability of patients with CRC. The transparent time-to-event ML models help clinicians to more accurately predict the survival rate for these patients and improve patient outcomes by enabling personalized treatment plans that are informed by explainable ML models.
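Two of the evaluation ideas named in the abstract, concordance-based discrimination and decision-curve net benefit, can be illustrated with a minimal sketch. This is not the study's code. It uses Harrell's concordance index, a simpler relative of the time-dependent concordance index reported above, and a net benefit formula that assumes every patient's outcome at the horizon is known (no censoring), which the study's 5-year decision curves would have had to handle differently. The function names `harrell_c_index` and `net_benefit` and all the toy data are hypothetical.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float], events: Sequence[int],
                    risk: Sequence[float]) -> float:
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when subject i had the event and a strictly
    shorter follow-up time than subject j. The pair is concordant when
    subject i also has the higher predicted risk; risk ties count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def net_benefit(y_true: Sequence[int], pred_risk: Sequence[float],
                threshold: float) -> float:
    """Net benefit of treating patients whose predicted risk >= threshold.

    Simplifying assumption: y_true is a fully observed binary outcome
    (1 = event by the horizon), so censoring is ignored.
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, pred_risk) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, pred_risk) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the risk threshold.
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Toy data, invented for illustration only.
toy_times = [2, 4, 6, 8, 10]            # follow-up time
toy_events = [1, 1, 0, 1, 0]            # 1 = event observed, 0 = censored
toy_risk = [0.9, 0.4, 0.5, 0.3, 0.1]    # higher = predicted to fail sooner
print(harrell_c_index(toy_times, toy_events, toy_risk))  # 0.875

toy_outcome = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
toy_pred = [0.9, 0.8, 0.7, 0.2, 0.6, 0.1, 0.3, 0.4, 0.2, 0.05]
model_nb = net_benefit(toy_outcome, toy_pred, 0.5)
treat_all_nb = net_benefit(toy_outcome, [1.0] * len(toy_outcome), 0.5)
print(model_nb, treat_all_nb)
```

A decision curve repeats the `net_benefit` calculation over a range of thresholds and compares the model against the treat-all strategy (as above) and the treat-none strategy, whose net benefit is 0 by definition.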
Pages: 12
Related papers
50 in total
  • [31] A Naive Bayes machine learning approach to risk prediction using censored, time-to-event data
    Wolfson, Julian
    Bandyopadhyay, Sunayan
    Elidrisi, Mohamed
    Vazquez-Benitez, Gabriela
    Vock, David M.
    Musgrove, Donald
    Adomavicius, Gediminas
    Johnson, Paul E.
    O'Connor, Patrick J.
    STATISTICS IN MEDICINE, 2015, 34 (21) : 2941 - 2957
  • [32] Predicting Metabolic Syndrome With Machine Learning Models Using a Decision Tree Algorithm: Retrospective Cohort Study
    Yu, Cheng-Sheng
    Lin, Yu-Jiun
    Lin, Chang-Hsien
    Wang, Sen-Te
    Lin, Shiyng-Yu
    Lin, Sanders H.
    Wu, Jenny L.
    Chang, Shy-Shin
    JMIR MEDICAL INFORMATICS, 2020, 8 (03)
  • [33] A Machine Learning Approach for High-Dimensional Time-to-Event Prediction With Application to Immunogenicity of Biotherapies in the ABIRISK Cohort
    Duhaze, Julianne
    Hassler, Signe
    Bachelet, Delphine
    Gleizes, Aude
    Hacein-Bey-Abina, Salima
    Allez, Matthieu
    Deisenhammer, Florian
    Fogdell-Hahn, Anna
    Mariette, Xavier
    Pallardy, Marc
    Broet, Philippe
    FRONTIERS IN IMMUNOLOGY, 2020, 11
  • [34] Predicting Fetal Alcohol Spectrum Disorders Using Machine Learning Techniques: Multisite Retrospective Cohort Study
    Oh, Sarah Soyeon
    Kuang, Irene
    Jeong, Hyewon
    Song, Jin-Yeop
    Ren, Boyu
    Moon, Jong Youn
    Park, Eun-Cheol
    Kawachi, Ichiro
    JOURNAL OF MEDICAL INTERNET RESEARCH, 2023, 25
  • [35] Deep neural networks integrating genomics and histopathological images for predicting stages and survival time-to-event in colon cancer
    Ogundipe, Olalekan
    Kurt, Zeyneb
    Woo, Wai Lok
    PLOS ONE, 2024, 19 (09)
  • [36] Prediction of Cardiovascular Disease in Patients with Systemic Lupus Erythematosus Using a Machine Learning Algorithm for Time-to-Event Outcomes: Random Survival Forest
    Liu, Hsin Yen
    Su, Jiandong
    Bonilla, Dennisse
    Duaibes, Sara
    Martinez, Juan Pablo Diaz
    Touma, Zahi
    ARTHRITIS & RHEUMATOLOGY, 2023, 75 : 2897 - 2900
  • [37] A multi-omics machine learning framework in predicting the survival of colorectal cancer patients
    Yang, Min
    Yang, Huandong
    Ji, Lei
    Hu, Xuan
    Tian, Geng
    Wang, Bing
    Yang, Jialiang
    COMPUTERS IN BIOLOGY AND MEDICINE, 2022, 146
  • [38] Machine learning for predicting survival of colorectal cancer patients (vol 13, 8874, 2023)
    Cardoso, Lucas Buk
    Parro, Vanderlei Cunha
    Peres, Stela Verzinhasse
    Curado, Maria Paula
    Fernandes, Gisele Aparecida
    Filho, Victor Wuensch
    Toporcov, Tatiana Natasha
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [39] Predicting events in clinical trials using two time-to-event outcomes
    Mu, Rongji
    Xu, Jin
    BIOMETRICAL JOURNAL, 2018, 60 (04) : 815 - 826
  • [40] Development and validation of machine learning models for predicting venous thromboembolism in colorectal cancer patients: A cohort study in China
    Hu, Zuhai
    Li, Xiaosheng
    Yuan, Yuliang
    Xu, Qianjie
    Zhang, Wei
    Lei, Haike
    INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2025, 195