Lung cancer is the second most prevalent cancer and has the highest mortality rate, which makes stratifying patients by expected survival necessary for developing effective treatment strategies. This study presents a two-stage framework for predicting lung cancer survival. The first stage, classification, forecasts the five-year survival probability of lung cancer patients. Only patients correctly classified as deceased in this stage are passed to the second stage, regression, which predicts their actual survival duration in months. The analysis uses the widely recognized Surveillance, Epidemiology, and End Results (SEER) database. To reduce dimensionality, two feature selection techniques are adopted: Recursive Feature Elimination with Random Forest (RFE-RF) and the Least Absolute Shrinkage and Selection Operator (LASSO). Machine learning models are then trained with five-fold cross-validation for both classification and regression. Experimental results show that ensemble methods outperform the other algorithms evaluated, including Logistic Regression (LR), Random Forest (RF), Multilayer Perceptron (MLP), AdaBoost, and Naive Bayes (NB), across the reported performance metrics. Existing techniques offer high accuracy for shorter survival periods, particularly survival times of up to 6 months. Notably, the Light Gradient Boosting Machine (LGBM) classifier combined with RFE-RF achieves the highest classification accuracy of 89.6% and an area under the receiver operating characteristic (ROC) curve (AUC) score of 92.03 for survival durations up to 11 months. In the regression stage, the LGBM regressor outperforms its counterparts with a Mean Absolute Error (MAE) of 7.53 and a Root Mean Squared Error (RMSE) of 10.49. The study also critically evaluates the effectiveness of various cost functions in regression, validating the accuracy of the survival duration predictions for the given dataset.
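The two-stage evaluation described in this abstract can be illustrated with a minimal Python sketch. It is not the study's implementation: the record layout, the function names (`evaluate_two_stage`, `toy_classifier`, `toy_regressor`), and the toy data are all hypothetical, and simple rule-based stand-ins replace the LGBM models and the SEER features. The sketch only shows how patients correctly classified as deceased in stage one are passed to the stage-two regressor, and how MAE and RMSE are then computed on that subset.

```python
import math

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared difference."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def evaluate_two_stage(records, classify_deceased, regress_months):
    """Run stage 1 (classification), then score stage 2 (regression).

    Only patients who are both predicted deceased and actually deceased
    (i.e. correctly classified as deceased) reach the regression stage.
    """
    selected = [
        r for r in records
        if r["died_within_5y"] and classify_deceased(r["features"])
    ]
    if not selected:
        raise ValueError("no patients were correctly classified as deceased")
    y_true = [r["survival_months"] for r in selected]
    y_pred = [regress_months(r["features"]) for r in selected]
    return {
        "n": len(selected),
        "mae": mean_absolute_error(y_true, y_pred),
        "rmse": root_mean_squared_error(y_true, y_pred),
    }

# Hypothetical toy records; the field names are assumptions, not SEER columns.
records = [
    {"features": {"stage": 4}, "died_within_5y": True, "survival_months": 6},
    {"features": {"stage": 3}, "died_within_5y": True, "survival_months": 14},
    {"features": {"stage": 1}, "died_within_5y": False, "survival_months": 60},
    {"features": {"stage": 3}, "died_within_5y": False, "survival_months": 60},
    {"features": {"stage": 2}, "died_within_5y": True, "survival_months": 30},
]

# Placeholder models standing in for the trained classifier and regressor.
def toy_classifier(features):
    return features["stage"] >= 3  # predict "deceased" for advanced stages

def toy_regressor(features):
    return 47 - 10 * features["stage"]  # predicted survival in months

result = evaluate_two_stage(records, toy_classifier, toy_regressor)
print(result)
```

Because RMSE squares each error before averaging, it penalizes large misses more heavily than MAE and is never smaller than MAE on the same predictions, which is one reason to report both.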