Machine Learning Approaches for Stroke Risk Prediction: Findings from the Suita Study

Cited by: 2
Authors
Vu, Thien [1, 2, 3]
Kokubo, Yoshihiro [2]
Inoue, Mai [1, 2]
Yamamoto, Masaki [1, 2]
Mohsen, Attayeb [1]
Martin-Morales, Agustin [1, 2]
Inoue, Takao [4]
Dawadi, Research [1, 2]
Araki, Michihiro [1, 2, 5, 6]
Affiliations
[1] Natl Inst Biomed Innovat Hlth & Nutr, Artificial Intelligence Ctr Hlth & Biomed Res, 3-17 Senrioka shinmachi, Settsu 5660002, Japan
[2] Natl Cerebral & Cardiovasc Ctr, 6-1 Kishibe Shinmachi, Suita, Osaka 5648565, Japan
[3] Cho Ray Hosp, Cardiovasc Ctr, Dept Vasc Surg, Ho Chi Minh City 72713, Vietnam
[4] Yamato Univ, Fac Informat, 2-5-1 Katayama, Suita 5640082, Japan
[5] Kyoto Univ, Grad Sch Med, Dept Resp Med, 54 Shogoin Kawahara cho,Sakyo ku, Kyoto 6068507, Japan
[6] Kobe Univ, Grad Sch Sci Technol & Innovat, 1-1 Rokkodai Cho,Nada Ku, Kobe 6578501, Japan
Funding
Japan Science and Technology Agency;
Keywords
stroke; supervised machine learning; unsupervised machine learning; logistic regression; random forest; support vector machine (SVM); extreme gradient boost (XGBoost); light gradient boosted machine (LightGBM); k-prototype clustering; Shapley Additive Explanations (SHAP); JAPANESE URBAN COHORT; CARDIOVASCULAR-DISEASE; HEMOGLOBIN CONCENTRATION; ATRIAL-FIBRILLATION; GLYCATED ALBUMIN; ISCHEMIC-STROKE; BLOOD-PRESSURE; ASSOCIATION; INCIDENT; FRUCTOSAMINE;
DOI
10.3390/jcdd11070207
Chinese Library Classification number
R5 [Internal Medicine];
Discipline classification code
1002; 100201;
Abstract
Stroke is a major public health concern because of its impact on mortality and morbidity. This study investigates the utility of machine learning algorithms for predicting stroke and identifying key risk factors, using data from the Suita Study comprising 7389 participants and 53 variables. Unsupervised k-prototype clustering first categorized participants into risk clusters; five supervised models, namely Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosted Machine (LightGBM), were then used to predict stroke outcomes. Stroke incidence differed substantially among the risk clusters identified by k-prototype clustering. Among the supervised models, RF was the preferable option because it achieved higher performance metrics. The Shapley Additive Explanations (SHAP) method identified age, systolic blood pressure, hypertension, estimated glomerular filtration rate, metabolic syndrome, and blood glucose level as key predictors of stroke, in line with the characteristics of the high-risk groups found by the unsupervised clustering approach. In addition, previously unidentified risk factors such as elbow joint thickness, fructosamine, hemoglobin, and calcium level showed potential for stroke prediction. In conclusion, machine learning enabled accurate stroke risk prediction and highlighted potential biomarkers, offering a data-driven framework for risk assessment and biomarker discovery.
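The abstract does not describe how the k-prototype clustering was implemented. As a rough illustration only, the sketch below shows the general idea of k-prototypes for mixed data in plain Python: the cost between a record and a prototype is the squared Euclidean distance over numeric features plus a weight `gamma` times the number of categorical mismatches, and prototypes are updated with numeric means and categorical modes. The function names, the `gamma` value, the choice of initial prototypes, and the six toy records (two standardized numeric values plus two yes/no flags) are all assumptions made for this example and are not taken from the Suita Study data or the authors' code.

```python
from collections import Counter


def mixed_dissimilarity(num_a, cat_a, num_b, cat_b, gamma):
    """Squared Euclidean distance on numeric features plus
    gamma times the count of categorical mismatches."""
    numeric = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    mismatches = sum(1 for x, y in zip(cat_a, cat_b) if x != y)
    return numeric + gamma * mismatches


def assign(records, prototypes, gamma):
    """Return, for each record, the index of its lowest-cost prototype."""
    labels = []
    for num, cat in records:
        costs = [mixed_dissimilarity(num, cat, p_num, p_cat, gamma)
                 for p_num, p_cat in prototypes]
        labels.append(costs.index(min(costs)))
    return labels


def update(records, labels, prototypes):
    """Recompute prototypes: numeric mean and categorical mode per cluster."""
    updated = []
    for j, old in enumerate(prototypes):
        members = [r for r, lab in zip(records, labels) if lab == j]
        if not members:
            updated.append(old)  # keep an empty cluster's prototype unchanged
            continue
        nums = [m[0] for m in members]
        cats = [m[1] for m in members]
        mean = tuple(sum(col) / len(col) for col in zip(*nums))
        mode = tuple(Counter(col).most_common(1)[0][0] for col in zip(*cats))
        updated.append((mean, mode))
    return updated


def k_prototypes(records, init_prototypes, gamma=1.0, max_iter=20):
    """Alternate assignment and update steps until labels stop changing."""
    prototypes = list(init_prototypes)
    labels = assign(records, prototypes, gamma)
    for _ in range(max_iter):
        prototypes = update(records, labels, prototypes)
        new_labels = assign(records, prototypes, gamma)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, prototypes


# Hypothetical toy records: (standardized numeric values, yes/no flags).
# These values are invented for illustration and are not study data.
records = [
    ((-1.0, -0.8), ("no", "no")),
    ((-0.9, -1.1), ("no", "no")),
    ((-1.2, -0.7), ("no", "yes")),
    ((1.0, 1.2), ("yes", "yes")),
    ((1.1, 0.9), ("yes", "yes")),
    ((0.8, 1.0), ("yes", "no")),
]
init = [records[0], records[3]]  # assumed starting prototypes
labels, prototypes = k_prototypes(records, init, gamma=0.5)
print(labels)
print(prototypes)
```

In practice, a maintained library implementation with proper initialization and a tuned `gamma` would normally be preferred over a hand-rolled loop like this one; the sketch is only meant to make the mixed numeric and categorical cost concrete.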
Pages: 13
Related papers
50 in total
  • [21] Lipoprotein(a) Levels and the Risk of Coronary Heart Disease and Stroke: The Suita Study
    Arafa, Ahmed
    Kato, Yuka
    Kokubo, Yoshihiro
    Khairan, Paramita
    Matsumoto, Chisa
    Nakao, Yoko M.
    Kataoka, Yu
    Harada-Shiba, Mariko
    JOURNAL OF ATHEROSCLEROSIS AND THROMBOSIS, 2025,
  • [22] Hyperuricemia and risk of ischemic stroke in an urban area of Japan: The Suita Study
    Miyamoto, Y.
    Watanabe, M.
    Kokubo, Y.
    Higashiyama, A.
    Nishimura, K.
    Takegami, M.
    Nakai, K.
    Nakao, Y.
    Kobayashi, T.
    Watanabe, T.
    Okayama, A.
    Okamura, T.
    INTERNATIONAL JOURNAL OF STROKE, 2014, 9 : 164 - 164
  • [23] A Study of Features Affecting on Stroke Prediction Using Machine Learning
    Songram, Panida
    Jareanpon, Chatklaw
    MULTI-DISCIPLINARY TRENDS IN ARTIFICIAL INTELLIGENCE, 2019, 11909 : 216 - 225
  • [24] Stroke risk prediction using machine learning: a prospective cohort study of 0.5 million Chinese adults
    Chun, Matthew
    Clarke, Robert
    Cairns, Benjamin J.
    Clifton, David
    Bennett, Derrick
    Chen, Yiping
    Guo, Yu
    Pei, Pei
    Lv, Jun
    Yu, Canqing
    Yang, Ling
    Li, Liming
    Chen, Zhengming
    Zhu, Tingting
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2021, 28 (08) : 1719 - 1727
  • [25] Comparison of risk scores for the prediction of stroke in African Americans: Findings from the Jackson Heart Study
    Foraker, Randi E.
    Greiner, Melissa
    Sims, Mario
    Tucker, Katherine L.
    Towfighi, Amytis
    Bidulescu, Aurelian
    Shoben, Abigail B.
    Smith, Sakima
    Talegawkar, Sameera
    Blackshear, Chad
    Wang, Wei
    Hardy, Natalie Chantelle
    O'Brien, Emily
    AMERICAN HEART JOURNAL, 2016, 177 : 25 - 32
  • [26] Machine Learning in Risk Prediction
    Sundstrom, Johan
    Schon, Thomas B.
    HYPERTENSION, 2020, 75 (05) : 1165 - 1166
  • [27] A Risk Score for the Prediction of Atrial Fibrillation in the Japanese Community: The Suita Study
    Kokubo, Yoshihiro
    Watanabe, Makoto
    Higashiyama, Aya
    Nakao, Yoko M.
    Watanabe, Takuya
    Takegami, Misa
    Kusano, Kengo
    Kamakura, Shiro
    Miyamoto, Yoshihiro
    CIRCULATION, 2015, 132
  • [28] Machine learning approaches for the prediction of postoperative complication risk in liver resection patients
    Zeng, Siyu
    Li, Lele
    Hu, Yanjie
    Luo, Li
    Fang, Yuanchen
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2021, 21 (01)
  • [29] Machine learning approaches for the prediction of postoperative complication risk in liver resection patients
    Siyu Zeng
    Lele Li
    Yanjie Hu
    Li Luo
    Yuanchen Fang
    BMC Medical Informatics and Decision Making, 21
  • [30] Machine learning approaches in Covid-19 severity risk prediction in Morocco
    Mariam Laatifi
    Samira Douzi
    Abdelaziz Bouklouz
    Hind Ezzine
    Jaafar Jaafari
    Younes Zaid
    Bouabid El Ouahidi
    Mariam Naciri
    Journal of Big Data, 9