Machine Learning Approaches for Stroke Risk Prediction: Findings from the Suita Study

Cited by: 2
Authors
Vu, Thien [1, 2, 3]
Kokubo, Yoshihiro [2]
Inoue, Mai [1, 2]
Yamamoto, Masaki [1, 2]
Mohsen, Attayeb [1]
Martin-Morales, Agustin [1, 2]
Inoue, Takao [4]
Dawadi, Research [1, 2]
Araki, Michihiro [1, 2, 5, 6]
Affiliations
[1] Natl Inst Biomed Innovat Hlth & Nutr, Artificial Intelligence Ctr Hlth & Biomed Res, 3-17 Senrioka shinmachi, Settsu 5660002, Japan
[2] Natl Cerebral & Cardiovasc Ctr, 6-1 Kishibe Shinmachi, Suita, Osaka 5648565, Japan
[3] Cho Ray Hosp, Cardiovasc Ctr, Dept Vasc Surg, Ho Chi Minh City 72713, Vietnam
[4] Yamato Univ, Fac Informat, 2-5-1 Katayama, Suita 5640082, Japan
[5] Kyoto Univ, Grad Sch Med, Dept Resp Med, 54 Shogoin Kawahara cho,Sakyo ku, Kyoto 6068507, Japan
[6] Kobe Univ, Grad Sch Sci Technol & Innovat, 1-1 Rokkodai Cho,Nada Ku, Kobe 6578501, Japan
Funding
Japan Science and Technology Agency;
Keywords
stroke; supervised machine learning; unsupervised machine learning; logistic regression; random forest; support vector machine (SVM); extreme gradient boost (XGBoost); light gradient boosted machine (LightGBM); k-prototype clustering; Shapley Additive Explanations (SHAP); JAPANESE URBAN COHORT; CARDIOVASCULAR-DISEASE; HEMOGLOBIN CONCENTRATION; ATRIAL-FIBRILLATION; GLYCATED ALBUMIN; ISCHEMIC-STROKE; BLOOD-PRESSURE; ASSOCIATION; INCIDENT; FRUCTOSAMINE;
DOI
10.3390/jcdd11070207
Chinese Library Classification number
R5 [Internal Medicine];
Discipline classification code
1002; 100201;
Abstract
Stroke constitutes a significant public health concern due to its impact on mortality and morbidity. This study investigates the utility of machine learning algorithms in predicting stroke and identifying key risk factors using data from the Suita study, comprising 7389 participants and 53 variables. Initially, unsupervised k-prototype clustering categorized participants into risk clusters, while five supervised models, namely Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosted Machine (LightGBM), were employed to predict stroke outcomes. Stroke incidence differed substantially among the risk clusters identified by k-prototype clustering. Supervised learning, particularly RF, was the preferable option because it achieved higher performance metrics. The Shapley Additive Explanations (SHAP) method identified age, systolic blood pressure, hypertension, estimated glomerular filtration rate, metabolic syndrome, and blood glucose level as key predictors of stroke, aligning with findings from the unsupervised clustering approach in high-risk groups. Additionally, previously unidentified risk factors such as elbow joint thickness, fructosamine, hemoglobin, and calcium level demonstrated potential for stroke prediction. In conclusion, machine learning facilitated accurate stroke risk predictions and highlighted potential biomarkers, offering a data-driven framework for risk assessment and biomarker discovery.
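The abstract names k-prototype clustering but does not describe its implementation. As a rough illustration only, the sketch below shows the general idea behind that family of methods: records with both numeric and categorical fields are compared with a mixed dissimilarity (squared Euclidean distance on numeric fields plus a weight gamma times the number of categorical mismatches), and each cluster prototype is updated with means and modes. The function names, the toy values, the gamma default, and the farthest-first initialization are all assumptions made for this example; none of them comes from the Suita study.

```python
# Minimal k-prototypes sketch in pure Python (illustrative, not the study's code).
# Each row is a tuple: numeric fields first, then categorical fields.

def dissimilarity(a, b, n_num, gamma):
    """Squared Euclidean distance on numeric fields plus gamma * categorical mismatches."""
    numeric = sum((x - y) ** 2 for x, y in zip(a[:n_num], b[:n_num]))
    mismatches = sum(x != y for x, y in zip(a[n_num:], b[n_num:]))
    return numeric + gamma * mismatches


def update_prototype(members, n_num):
    """Prototype = column means for numeric fields, column modes for categorical fields."""
    cols = list(zip(*members))
    means = [sum(c) / len(c) for c in cols[:n_num]]
    # sorted() makes tie-breaking between equally frequent categories deterministic
    modes = [max(sorted(set(c)), key=c.count) for c in cols[n_num:]]
    return tuple(means + modes)


def k_prototypes(rows, k, n_num, gamma=1.0, n_iter=20):
    # Farthest-first initialization (an assumption chosen here for determinism).
    prototypes = [rows[0]]
    while len(prototypes) < k:
        prototypes.append(max(
            rows,
            key=lambda r: min(dissimilarity(r, p, n_num, gamma) for p in prototypes),
        ))
    labels = None
    for _ in range(n_iter):
        # Assign every row to its nearest prototype.
        new_labels = [
            min(range(k), key=lambda j: dissimilarity(r, prototypes[j], n_num, gamma))
            for r in rows
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Recompute each prototype from its current members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                prototypes[j] = update_prototype(members, n_num)
    return labels, prototypes


# Made-up, standardized toy records: (age_z, systolic_bp_z, hypertension)
rows = [
    (-1.0, -0.9, "no"), (-0.8, -1.1, "no"), (-1.2, -0.7, "no"),
    (1.0, 1.1, "yes"), (0.9, 0.8, "yes"), (1.2, 1.0, "yes"),
]
labels, prototypes = k_prototypes(rows, k=2, n_num=2)
print(labels)
print(prototypes)
```

In practice the weight gamma controls how much a categorical mismatch counts relative to numeric distance, so numeric fields are usually standardized before clustering, as the toy values here are assumed to be.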
Pages: 13