Lung cancer survival prognosis using a two-stage modeling approach

Cited by: 0

Authors:

Aggarwal, Preeti [1]

Marwah, Namrata [1]

Kaur, Ravreet [1]

Mittal, Ajay [1]

Affiliations:

[1] Panjab Univ, Dept Comp Engn, UIET, Chandigarh 160014, India

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2024 / Vol. 83 / Issue 22

Keywords:

SEER; Machine learning; Lung cancer; Survival prediction; Feature selection; DIMENSIONALITY REDUCTION; PREDICTION; CLASSIFICATION; DIAGNOSIS;

DOI:

10.1007/s11042-024-18280-2

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Lung cancer, the second most prevalent form of cancer with the highest mortality rate, necessitates the stratification of patients based on their survival rates to develop effective treatment strategies. This study presents a two-stage framework for predicting lung cancer survival. The initial stage, classification, focuses on forecasting the five-year survival probability of lung cancer patients. Subsequent analysis was conducted on patients accurately classified as deceased during this stage. The second stage, regression, predicts the actual survival duration in months for deceased patients. This analysis employs the widely recognized Surveillance, Epidemiology, and End Results (SEER) database. To reduce dimensionality, two feature selection techniques, Recursive Feature Elimination with Random Forest (RFE-RF) and the Least Absolute Shrinkage and Selection Operator (LASSO), were adopted. Machine learning models were then trained using five-fold cross-validation for both classification and regression. Experimental results demonstrate that ensemble methods outperform other algorithms, including Logistic Regression (LR), Random Forest (RF), Multilayer Perceptron (MLP), Adaboost, and Naive Bayes (NB), in terms of performance metrics. The existing techniques offer high accuracy for shorter survival periods, particularly for survival times of up to 6 months. Notably, the Light Gradient Boosting Machine (LGBM) classifier combined with RFE-RF achieves the highest classification accuracy of 89.6% and an area under the receiver operating characteristic (ROC) curve (AUC) score of 92.03 for survival durations up to 11 months. In regression analysis, the LGBM regressor outperforms its counterparts with a Mean Absolute Error (MAE) value of 7.53 and a Root Mean Squared Error (RMSE) value of 10.49. The study critically evaluates various cost functions' effectiveness in regression, validating the accuracy of survival duration predictions for the given dataset.
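The abstract says the regression stage is evaluated only on patients who were correctly classified as deceased in the first stage. The following is a minimal sketch of how that hand-off and the two reported regression metrics (MAE and RMSE) could be computed. It is not the authors' implementation: the function name `two_stage_metrics`, the toy labels, and the toy survival times are all hypothetical and chosen only for illustration, and the models themselves are left out.

```python
import math


def two_stage_metrics(true_dead, pred_dead, true_months, pred_months):
    """Score a two-stage survival pipeline (hypothetical helper).

    true_dead / pred_dead: 1 if the patient died within the horizon, else 0.
    true_months / pred_months: survival time in months; pred_months may be
    None for patients the classifier did not send to the regression stage.
    """
    n = len(true_dead)

    # Stage 1: plain classification accuracy over all patients.
    accuracy = sum(t == p for t, p in zip(true_dead, pred_dead)) / n

    # Hand-off: keep patients who are deceased AND predicted deceased,
    # mirroring "accurately classified as deceased" in the abstract.
    kept = [i for i in range(n) if true_dead[i] == 1 and pred_dead[i] == 1]
    if not kept:
        raise ValueError("no correctly classified deceased patients")

    # Stage 2: regression errors in months on the kept patients only.
    errors = [pred_months[i] - true_months[i] for i in kept]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return accuracy, kept, mae, rmse


# Toy data (assumed values, not taken from SEER or from the paper).
true_dead = [1, 1, 0, 1, 0, 1]
pred_dead = [1, 0, 0, 1, 1, 1]
true_months = [10, 24, 70, 6, 65, 30]
pred_months = [13, None, None, 2, 40, 30]

accuracy, kept, mae, rmse = two_stage_metrics(
    true_dead, pred_dead, true_months, pred_months
)
print(f"accuracy={accuracy:.3f} kept={kept} MAE={mae:.3f} RMSE={rmse:.3f}")
```

In this toy run, patient 4 is predicted deceased but actually survived, so that patient is excluded from the regression metrics even though a deployed pipeline would still produce a duration estimate for them. RMSE is never smaller than MAE on the same errors, which is consistent with the reported 10.49 versus 7.53.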


Pages: 61407-61434

Page count: 28
