Machine Learning Approaches for Stroke Risk Prediction: Findings from the Suita Study

Cited by: 2
Authors
Vu, Thien [1, 2, 3]
Kokubo, Yoshihiro [2]
Inoue, Mai [1, 2]
Yamamoto, Masaki [1, 2]
Mohsen, Attayeb [1]
Martin-Morales, Agustin [1, 2]
Inoue, Takao [4]
Dawadi, Research [1, 2]
Araki, Michihiro [1, 2, 5, 6]
Affiliations
[1] Natl Inst Biomed Innovat Hlth & Nutr, Artificial Intelligence Ctr Hlth & Biomed Res, 3-17 Senrioka shinmachi, Settsu 5660002, Japan
[2] Natl Cerebral & Cardiovasc Ctr, 6-1 Kishibe Shinmachi, Suita, Osaka 5648565, Japan
[3] Cho Ray Hosp, Cardiovasc Ctr, Dept Vasc Surg, Ho Chi Minh City 72713, Vietnam
[4] Yamato Univ, Fac Informat, 2-5-1 Katayama, Suita 5640082, Japan
[5] Kyoto Univ, Grad Sch Med, Dept Resp Med, 54 Shogoin Kawahara cho,Sakyo ku, Kyoto 6068507, Japan
[6] Kobe Univ, Grad Sch Sci Technol & Innovat, 1-1 Rokkodai Cho,Nada Ku, Kobe 6578501, Japan
Funding
Japan Science and Technology Agency
Keywords
stroke; supervised machine learning; unsupervised machine learning; logistic regression; random forest; support vector machine (SVM); extreme gradient boost (XGBoost); light gradient boosted machine (LightGBM); k-prototype clustering; Shapley Additive Explanations (SHAP); JAPANESE URBAN COHORT; CARDIOVASCULAR-DISEASE; HEMOGLOBIN CONCENTRATION; ATRIAL-FIBRILLATION; GLYCATED ALBUMIN; ISCHEMIC-STROKE; BLOOD-PRESSURE; ASSOCIATION; INCIDENT; FRUCTOSAMINE;
DOI
10.3390/jcdd11070207
Chinese Library Classification number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Stroke is a significant public health concern because of its impact on mortality and morbidity. This study investigates the utility of machine learning algorithms for predicting stroke and identifying key risk factors, using data from the Suita study comprising 7389 participants and 53 variables. Initially, unsupervised k-prototype clustering categorized participants into risk clusters, while five supervised models, namely Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosted Machine (LightGBM), were employed to predict stroke outcomes. Stroke incidence differed substantially among the risk clusters identified by k-prototype clustering. Among the supervised models, RF was the preferable option because it achieved higher performance metrics. The Shapley Additive Explanations (SHAP) method identified age, systolic blood pressure, hypertension, estimated glomerular filtration rate, metabolic syndrome, and blood glucose level as key predictors of stroke, consistent with the findings from the unsupervised clustering approach in high-risk groups. Additionally, previously unidentified risk factors such as elbow joint thickness, fructosamine, hemoglobin, and calcium level showed potential for stroke prediction. In conclusion, machine learning facilitated accurate stroke risk prediction and highlighted potential biomarkers, offering a data-driven framework for risk assessment and biomarker discovery.
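The abstract names k-prototype clustering but does not describe how it handles mixed data. As a rough illustration only, the sketch below shows the dissimilarity measure commonly used by k-prototypes (squared Euclidean distance on numeric features plus a weight `gamma` times the number of mismatching categorical features) and the step that assigns a participant to the nearest prototype. It is not the authors' implementation; the function names, the two toy prototypes, the feature values, and `gamma=0.5` are all hypothetical.

```python
from typing import Sequence, Tuple

Prototype = Tuple[Sequence[float], Sequence[str]]


def mixed_dissimilarity(num_a: Sequence[float], cat_a: Sequence[str],
                        num_b: Sequence[float], cat_b: Sequence[str],
                        gamma: float = 1.0) -> float:
    """Squared Euclidean distance on numeric features plus
    gamma times the count of mismatching categorical features."""
    numeric_part = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    categorical_part = sum(1 for x, y in zip(cat_a, cat_b) if x != y)
    return numeric_part + gamma * categorical_part


def assign_to_prototype(num: Sequence[float], cat: Sequence[str],
                        prototypes: Sequence[Prototype],
                        gamma: float = 1.0) -> int:
    """Return the index of the prototype with the smallest dissimilarity."""
    distances = [mixed_dissimilarity(num, cat, p_num, p_cat, gamma)
                 for p_num, p_cat in prototypes]
    return min(range(len(prototypes)), key=distances.__getitem__)


# Hypothetical prototypes: (standardized age, standardized systolic BP),
# (hypertension, metabolic syndrome). These values are made up.
prototypes = [
    ((-0.5, -0.5), ("no", "no")),    # assumed lower-risk cluster center
    ((1.0, 1.2), ("yes", "yes")),    # assumed higher-risk cluster center
]

# One hypothetical participant.
participant_num = (0.8, 1.0)
participant_cat = ("yes", "no")

cluster = assign_to_prototype(participant_num, participant_cat,
                              prototypes, gamma=0.5)
print(cluster)
```

A full k-prototypes run would alternate this assignment step with updating each prototype (mean of numeric features, mode of categorical features) until assignments stop changing; the weight `gamma` controls how much categorical mismatches count relative to numeric distance.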
Pages: 13