Lung cancer survival prognosis using a two-stage modeling approach

Cited by: 0
Authors
Aggarwal, Preeti [1]
Marwah, Namrata [1]
Kaur, Ravreet [1]
Mittal, Ajay [1]
Affiliations
[1] Panjab Univ, Dept Comp Engn, UIET, Chandigarh 160014, India
Keywords
SEER; Machine learning; Lung cancer; Survival prediction; Feature selection; DIMENSIONALITY REDUCTION; PREDICTION; CLASSIFICATION; DIAGNOSIS;
DOI
10.1007/s11042-024-18280-2
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Lung cancer, the second most prevalent form of cancer and the one with the highest mortality rate, requires stratifying patients by their survival rates to develop effective treatment strategies. This study presents a two-stage framework for predicting lung cancer survival. The first stage, classification, forecasts the five-year survival probability of lung cancer patients. Subsequent analysis was conducted on patients correctly classified as deceased during this stage. The second stage, regression, predicts the actual survival duration in months for these deceased patients. The analysis uses the widely recognized Surveillance, Epidemiology, and End Results (SEER) database. To reduce dimensionality, two feature selection techniques were adopted: Recursive Feature Elimination with Random Forest (RFE-RF) and the Least Absolute Shrinkage and Selection Operator (LASSO). Machine learning models were then trained using five-fold cross-validation for both classification and regression. Experimental results demonstrate that ensemble methods outperform other algorithms, including Logistic Regression (LR), Random Forest (RF), Multilayer Perceptron (MLP), AdaBoost, and Naive Bayes (NB), in terms of performance metrics. The existing techniques offer high accuracy for shorter survival periods, particularly for survival times of up to 6 months. Notably, the Light Gradient Boosting Machine (LGBM) classifier combined with RFE-RF achieves the highest classification accuracy of 89.6% and an area under the receiver operating characteristic (ROC) curve (AUC) score of 92.03 for survival durations up to 11 months. In regression analysis, the LGBM regressor outperforms its counterparts with a Mean Absolute Error (MAE) value of 7.53 and a Root Mean Squared Error (RMSE) value of 10.49. The study critically evaluates the effectiveness of various cost functions in regression, validating the accuracy of survival duration predictions for the given dataset.
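As a rough illustration of the two-stage flow described in the abstract, the sketch below classifies toy records, keeps only the patients correctly classified as deceased, and scores a month-level prediction with MAE and RMSE. Everything in it is an assumption made for illustration: the records are invented, and `classify_deceased` and `predict_months` are hypothetical rule-based stand-ins, not the LGBM models, the feature selection, or the SEER data used in the study.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error: average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Invented toy records (NOT SEER data):
# (age, tumour_stage, died_within_5_years, survival_months)
records = [
    (72, 4, True, 5),
    (65, 3, True, 14),
    (58, 1, False, 60),
    (80, 4, True, 3),
    (49, 2, False, 60),
    (61, 3, True, 20),
    (55, 3, False, 60),  # will be misclassified as deceased
    (70, 2, True, 30),   # will be misclassified as survived
]

def classify_deceased(age, stage):
    """Hypothetical stage-1 stand-in: predict death within five years."""
    return stage >= 3

def predict_months(age, stage):
    """Hypothetical stage-2 stand-in: crude linear rule for survival months."""
    return max(1.0, 50.0 - 11.0 * stage)

# Stage 1: classification, scored by plain accuracy.
predicted = [classify_deceased(age, stage) for age, stage, _, _ in records]
actual = [died for _, _, died, _ in records]
accuracy = sum(p == a for p, a in zip(predicted, actual)) / len(records)

# Keep only patients correctly classified as deceased.
stage2 = [r for r, p in zip(records, predicted) if p and r[2]]

# Stage 2: regression on survival months, scored by MAE and RMSE.
true_months = [months for _, _, _, months in stage2]
pred_months = [predict_months(age, stage) for age, stage, _, _ in stage2]
mae_value = mae(true_months, pred_months)
rmse_value = rmse(true_months, pred_months)

print(f"accuracy={accuracy:.2f}  MAE={mae_value:.2f}  RMSE={rmse_value:.2f}")
```

RMSE is never smaller than MAE on the same predictions because squaring weights large errors more heavily, which is one reason the two metrics are usually reported together for survival-duration regression.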
Pages: 61407 - 61434
Page count: 28
Related papers
50 in total
  • [1] A two-stage modeling approach for breast cancer survivability prediction
    Sedighi-Maman, Zahra
    Mondello, Alexa
    INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2021, 149
  • [2] Modeling survival in childhood cancer studies using two-stage non-mixture cure models
    Weston, Claire L.
    Thompson, John R.
    JOURNAL OF APPLIED STATISTICS, 2010, 37 (09) : 1523 - 1535
  • [3] Using two-stage approach to clustering
    Yue, Shihong
    Song, Kai
    Li, Yi
    2006 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY, VOLS 1-6, 2006, : 488 - +
  • [4] A two-stage approach to modeling vacant taxi movements
    Wong, R. C. P.
    Szeto, W. Y.
    Wong, S. C.
    TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2015, 59 : 147 - 163
  • [5] Pathway-Structured Predictive Model for Cancer Survival Prediction: A Two-Stage Approach
    Zhang, Xinyan
    Li, Yan
    Akinyemiju, Tomi
    Ojesina, Akinyemi I.
    Buckhaults, Phillip
    Liu, Nianjun
    Xu, Bo
    Yi, Nengjun
    GENETICS, 2017, 205 (01) : 89 - +
  • [6] A Two-Stage Approach to Modeling Vacant Taxi Movements
    Wong, R. C. P.
    Szeto, W. Y.
    Wong, S. C.
    21ST INTERNATIONAL SYMPOSIUM ON TRANSPORTATION AND TRAFFIC THEORY, 2015, 7 : 254 - 275
  • [7] A TWO-STAGE DEEP MODELING APPROACH TO ARTICULATORY INVERSION
    Shahrebabaki, Abdolreza Sabzi
    Olfati, Negar
    Imran, Ali Shariq
    Johnsen, Magne Hallstein
    Siniscalchi, Sabato Marco
    Svendsen, Torbjorn
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6453 - 6457
  • [8] Detecting Circles Using a Two-Stage Approach
    Wen-Yen Wu
    Journal of Electronic Science and Technology, 2014, 12 (03) : 318 - 321
  • [9] Detecting Circles Using a Two-Stage Approach
    Wen-Yen Wu
    Journal of Electronic Science and Technology, 2014, (03) : 318 - 321
  • [10] A hybrid two-stage approach for paroxysmal atrial fibrillation prognosis problem
    Lynn, KS
    Chiang, HD
    COMPUTERS IN CARDIOLOGY 2002, VOL 29, 2002, 29 : 481 - 484