An AI-driven Predictive Model for Pancreatic Cancer Patients Using Extreme Gradient Boosting

Cited by: 4
Authors
Chakraborty, Aditya [1]
Tsokos, Chris P. [2]
Affiliations
[1] Eastern Virginia Med Sch, Norfolk, VA 23507 USA
[2] Univ S Florida, Tampa, FL USA
Source
Journal of Statistical Theory and Applications, 2023, 22: 262-282
Keywords
Pancreatic Cancer; Extreme Gradient Boosting; Boosted Regression Trees; Pancreatic Risk Factors; Grid Search Mechanism; REGRESSION; NETWORKS; XGBOOST
DOI
10.1007/s44199-023-00063-7
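Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract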
Pancreatic cancer is one of the deadliest cancers worldwide. Most patients are diagnosed at Stage III or Stage IV, when the chances of survival are very low. This study builds a data-driven predictive model from the associated risk factors and identifies the factors that contribute most to the survival times of patients diagnosed with pancreatic cancer, using the XGBoost (eXtreme Gradient Boosting) algorithm. A grid search was used to find the optimal hyperparameter values of the model by minimizing the root mean square error (RMSE); the final hyperparameters were selected by comparing 243 competing models. To check the validity of the model, we compared its performance with ten deep neural network models, grown sequentially with different activation functions and optimizers. We also constructed an ensemble model using a Gradient Boosting Machine (GBM). The proposed XGBoost model outperformed all competing models we considered with regard to RMSE. After the model was developed, the individual risk factors were ranked by their contribution to the response predictions, which can help pancreatic cancer research organizations direct their resources toward the most influential risk factors. The three most influential risk factors affecting the survival of pancreatic cancer patients were found to be the age of the patient, current BMI, and cigarette smoking years, with contributions of 35.5%, 24.3%, and 14.93%, respectively. The predictive model is approximately 96.42% accurate in predicting the survival times of patients diagnosed with pancreatic cancer and performs well on test data.
The analytical methodology used to develop the model can be applied to other prediction problems. It can be used to predict the time to death for a specific type of cancer, given a set of numeric and non-numeric features.
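The grid-search step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption: the abstract states only that 243 competing models were compared by RMSE, so the grid below (five hyperparameters with three candidate values each, giving 3^5 = 243 combinations), the parameter values, and the `toy_evaluate` stand-in are hypothetical. In a real workflow, `evaluate` would train a boosted-tree model with the given parameters and return its validation RMSE.

```python
import itertools
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Hypothetical grid (an assumption, not taken from the paper): the names follow
# common XGBoost parameters; three values for each of five parameters.
PARAM_GRID = {
    "eta": [0.01, 0.1, 0.3],
    "max_depth": [3, 6, 9],
    "subsample": [0.6, 0.8, 1.0],
    "colsample_bytree": [0.6, 0.8, 1.0],
    "min_child_weight": [1, 3, 5],
}


def iter_grid(grid):
    """Yield one dict per combination of candidate values."""
    names = list(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def grid_search(grid, evaluate):
    """Evaluate every combination and return (best_params, best_score, n_evaluated).

    `evaluate` maps a parameter dict to a score to be minimized (here, RMSE).
    """
    best_params, best_score, n_evaluated = None, math.inf, 0
    for params in iter_grid(grid):
        score = evaluate(params)
        n_evaluated += 1
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score, n_evaluated


def toy_evaluate(params):
    """Synthetic stand-in for 'train a model and return validation RMSE'.

    Predictions are the targets shifted by the distance of `params` from an
    arbitrary reference setting, so the reference setting scores exactly 0.
    """
    y_true = [10.0, 20.0, 30.0]
    offset = (
        abs(params["eta"] - 0.1)
        + abs(params["max_depth"] - 6)
        + abs(params["subsample"] - 0.8)
        + abs(params["colsample_bytree"] - 0.8)
        + abs(params["min_child_weight"] - 3)
    )
    y_pred = [y + offset for y in y_true]
    return rmse(y_true, y_pred)


best_params, best_score, n_models = grid_search(PARAM_GRID, toy_evaluate)
print(n_models, best_params, best_score)
```

Passing the scoring function in as a callable keeps the search loop independent of any particular modeling library, so the same loop would work with a cross-validated RMSE from a real model.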
Pages: 262-282
Page count: 21
Related papers
50 in total
  • [1] An AI-driven Predictive Model for Pancreatic Cancer Patients Using Extreme Gradient Boosting
    Aditya Chakraborty
    Chris P. Tsokos
    Journal of Statistical Theory and Applications, 2023, 22 : 262 - 282
  • [2] AI-driven predictive models for sustainability
    Olawumi, Mattew A.
    Oladapo, Bankole I.
    JOURNAL OF ENVIRONMENTAL MANAGEMENT, 2025, 373
  • [3] Explainable AI-driven model for gastrointestinal cancer classification
    Binzagr, Faisal
    FRONTIERS IN MEDICINE, 2024, 11
  • [4] Revolutionizing Credit Risk: A Deep Dive into Gradient-Boosting Techniques in AI-Driven Finance
    Bin Tareaf, Raad
    AbuJarour, Mohammed
    Zinn, Fabian
    38TH INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING, ICOIN 2024, 2024 : 322 - 327
  • [5] Construction and Validation of a Predictive Model for Coronary Artery Disease Using Extreme Gradient Boosting
    Zhang, Zheng
    Shao, Binbin
    Liu, Hongzhou
    Huang, Ben
    Gao, Xuechen
    Qiu, Jun
    Wang, Chen
    JOURNAL OF INFLAMMATION RESEARCH, 2024, 17 : 4163 - 4174
  • [6] Predictive maintenance for printed circuit boards using eXtreme gradient boosting
    Huang, Chien-Yi
    Hsieh, Hao-Chun
    Li, Yan-Cheng
    MICROELECTRONICS INTERNATIONAL, 2025
  • [7] Remote Diagnosis and Triaging Model for Skin Cancer Using EfficientNet and Extreme Gradient Boosting
    Khan, Irfan Ullah
    Aslam, Nida
    Anwar, Talha
    Aljameel, Sumayh S.
    Ullah, Mohib
    Khan, Rafiullah
    Rehman, Abdul
    Akhtar, Nadeem
    COMPLEXITY, 2021, 2021
  • [8] Cervical Cancer Diagnosis Model Using Extreme Gradient Boosting and Bioinspired Firefly Optimization
    Khan, Irfan Ullah
    Aslam, Nida
    Alshehri, Rawan
    Alzahrani, Seham
    Alghamdi, Manal
    Almalki, Atheer
    Balabeed, Maryam
    SCIENTIFIC PROGRAMMING, 2021, 2021
  • [9] Sustainable biofabrication: from bioprinting to AI-driven predictive methods
    Filippi, Miriam
    Mekkattu, Manuel
    Katzschmann, Robert K.
    TRENDS IN BIOTECHNOLOGY, 2025, 43 (02) : 290 - 303
  • [10] AI-driven predictive modeling for disease prevention and early detection
    Behera, Bikash
    Irshad, Azeem
    Rida, Imad
    Shabaz, Mohammad
    SLAS TECHNOLOGY, 2025, 31