Machine Learning Improves Prediction Over Logistic Regression on Resected Colon Cancer Patients

Cited by: 19
Authors
Leonard, Grey [1]
South, Charles [2]
Balentine, Courtney [1, 3, 4]
Porembka, Matthew [1]
Mansour, John [1]
Wang, Sam [1]
Yopp, Adam [1]
Polanco, Patricio [1]
Zeh, Herbert [1]
Augustine, Mathew [1, 3]
Affiliations
[1] Univ Texas Southwestern Med Ctr Dallas, Dept Surg, Dallas, TX 75390 USA
[2] Southern Methodist Univ, Dept Stat Sci, Dallas, TX USA
[3] VA North Texas Healthcare Syst, Dallas, TX USA
[4] UTSW Surg Ctr Outcomes Implementat & Novel Interv, Dallas, TX USA
Keywords
Colon cancer; Prediction; Machine learning; Outcomes; Risk; READMISSION; COMPLICATIONS; MODEL; RISK; MORTALITY; COLECTOMY; ADULTS; COST
DOI
10.1016/j.jss.2022.01.012
Chinese Library Classification number
R61 [Surgical operations]
Abstract
Introduction: Despite advances, readmission and mortality rates for surgical patients with colon cancer remain high. Prediction models using regression techniques allow for risk stratification to aid periprocedural care. Technological advances have enabled large datasets to be analyzed using machine learning (ML) algorithms. A national database of colon cancer patients was selected to determine whether ML methods better predict outcomes following surgery compared to conventional methods.
Methods: Surgical colon cancer patients were identified using the 2013 National Cancer Database (NCDB). The negative outcome was defined as a composite of 30-d unplanned readmission and 30- and 90-d mortality. ML models, including Random Forest and XGBoost, were built and compared with conventional logistic regression. To account for unbalanced outcomes, a synthetic minority oversampling technique (SMOTE) was implemented and applied using XGBoost.
Results: Analysis included 528,060 patients. The negative outcome occurred in 11.6% of patients. Model building utilized 30 variables. The primary metric for model comparison was area under the curve (AUC). In comparison to logistic regression (AUC 0.730, 95% CI: 0.725-0.735), AUCs for ML algorithms ranged between 0.748 and 0.757, with the Random Forest model (AUC 0.757, 95% CI: 0.752-0.762) outperforming XGBoost (AUC 0.756, 95% CI: 0.751-0.761) and XGBoost using SMOTE data (AUC 0.748, 95% CI: 0.743-0.753).
Conclusions: We show that a large registry of surgical colon cancer patients can be utilized to build ML models to improve outcome prediction with differential discriminative ability. These results reveal the potential of these methods to enhance risk prediction, leading to improved strategies to mitigate those risks. © 2022 Elsevier Inc. All rights reserved.
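The abstract compares models by AUC with 95% confidence intervals. As a rough illustration of that metric only, the sketch below computes AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, and attaches a percentile bootstrap interval. It is not the authors' pipeline: the data are synthetic, `model_a` and `model_b` are hypothetical score vectors, and the bootstrap is an assumption, since the abstract does not say how its intervals were obtained. Only the 11.6% event rate is taken from the abstract.

```python
import random


def auc(labels, scores):
    """AUC as the Mann-Whitney statistic: share of (positive, negative)
    pairs where the positive case scores higher; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the paper)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int(alpha / 2 * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Synthetic, imbalanced outcome with roughly the 11.6% event rate
# reported in the abstract; everything else here is made up.
rng = random.Random(42)
labels = [1 if rng.random() < 0.116 else 0 for _ in range(300)]
model_a = [0.8 * y + rng.gauss(0, 1) for y in labels]  # hypothetical weaker model
model_b = [1.0 * y + rng.gauss(0, 1) for y in labels]  # hypothetical stronger model

for name, scores in [("model_a", model_a), ("model_b", model_b)]:
    lo, hi = bootstrap_ci(labels, scores)
    print(f"{name}: AUC {auc(labels, scores):.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

The pairwise loop is quadratic and only suitable for toy data; on a registry of the size described here, a rank-based implementation from an established library would be the practical choice.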
Pages: 181-193
Page count: 13